Preface


This volume contains the papers presented at CALDAM 2022 (the 8th International 
Conference on Algorithms and Discrete Applied Mathematics) held during February 
10-12, 2022, at Pondicherry University, Puducherry, India. CALDAM 2022 was 
organized by the Department of Mathematics, Pondicherry University, and the 
Association for Computer Science and Discrete Mathematics (ACSDM), India. The 
program committee consisted of 34 highly experienced and active researchers from 
various countries. 

The conference topics included algorithms, graph theory, computational geometry, 
and optimization. We received 80 submissions from authors from all over the world. 
Each paper was extensively reviewed by program committee members and other expert 
reviewers. The committee decided to accept 24 papers for presentation. The program 
included three Google invited talks by Timothy M. Chan (University of Illinois at 
Urbana-Champaign), Daya R. Gaur (University of Lethbridge), and Joseph S. B. Mitchell
(Stony Brook University). 

As volume editors, we would like to thank the authors of all submissions for 
considering CALDAM 2022 for the potential presentation of their works. We are very 
much indebted to the program committee members and the external reviewers for 
providing serious reviews within a very short period. We thank Springer for publishing 
the proceedings in the Lecture Notes in Computer Science series. Our sincerest thanks 
to the invited speakers, Timothy M. Chan, Daya R. Gaur, and Joseph S. B. Mitchell, for 
accepting our invitation to give a talk. We thank the organizing committee, chaired by 
S. Francis Raj of Pondicherry University, for conducting CALDAM 2022 smoothly, and 
Pondicherry University for providing the necessary facilities. We are very grateful to the 
chair of the steering committee, Subir Ghosh, for his active help, support, and guidance. 
We also thank the Program Committee co-chairs of CALDAM 2021, Apurva Mudgal
and C R Subrahmanyam, for their timely input throughout. We thank our sponsors:
Google Inc. for its financial support and Springer for the best paper presentation
awards. We also thank the Springer OCS staff for their support.


February 2022 Niranjan Balachandran 
R. Inkulu 


Organization 


Steering Committee

Subir Kumar Ghosh (Chair)         Ramakrishna Mission Vivekananda Educational
                                  and Research Institute, India
Gyula O. H. Katona                Alfréd Rényi Institute of Mathematics, Hungarian
                                  Academy of Sciences, Hungary
Janos Pach                        Ecole Polytechnique Fédérale De Lausanne
                                  (EPFL), Switzerland
Nicola Santoro                    Carleton University, Canada
Swami Sarvattomananda             Ramakrishna Mission Vivekananda Educational
                                  and Research Institute, India
Chee Yap                          Courant Institute of Mathematical Sciences,
                                  New York University, USA


Program Committee

Amitabha Bagchi                   IIT Delhi, India
Niranjan Balachandran (Co-chair)  IIT Bombay, India
Boštjan Brešar                    University of Maribor, Slovenia
Sergio Cabello                    University of Ljubljana, Slovenia
Paz Carmi                         Ben-Gurion University of the Negev, Israel
Manoj Changat                     University of Kerala, India
Sandip Das                        ISI Kolkata, India
Josep Diaz                        Polytechnic University of Catalonia, Spain
Martin Furer                      Pennsylvania State University, USA
Daya Gaur                         University of Lethbridge, Canada
Sathish Govindarajan              IISc Bangalore, India
Pavol Hell                        Simon Fraser University, Canada
R. Inkulu (Co-chair)              IIT Guwahati, India
Subrahmanyam Kalyanasundaram      IIT Hyderabad, India
Van Bang Le                       University of Rostock, Germany
Sanjiv Kapoor                     Illinois Institute of Technology, USA
Andrzej Lingas                    Lund University, Sweden
Anil Maheshwari                   Carleton University, Canada
Bodo Manthey                      University of Twente, The Netherlands
Rogers Mathew                     IIT Hyderabad, India
Bojan Mohar                       Simon Fraser University, Canada
Apurva Mudgal                     IIT Ropar, India


Rahul Muthu                       DA-IICT, India
Kamal Lochan Patra                NISER Bhubaneswar, India
Iztok Peterin                     University of Maribor, Slovenia
Valentin Polishchuk               Linköping University, Sweden
Deepak Rajendraprasad             IIT Palakkad, India
Abhiram Ranade                    IIT Bombay, India
Sagnik Sen                        IIT Dharwad, India
Rishi Ranjan Singh                IIT Bhilai, India
Michiel Smid                      Carleton University, Canada
Joachim Spoerhase                 University of Würzburg, Germany
C. R. Subramanian                 IMSc Chennai, India
Antoine Vigneron                  UNIST, South Korea


Organizing Committee

Rajeswari Seshadri                Pondicherry University, India
Malai Subbiah                     Pondicherry University, India
S. R. Kannan                      Pondicherry University, India
T. Duraivel                       Pondicherry University, India
A. Joseph Kennedy                 Pondicherry University, India
S. Francis Raj (Chair)            Pondicherry University, India
Syeda Noor Fathima                Pondicherry University, India
I. Subramania Pillai              Pondicherry University, India
Swami Dhyanagamyananda            Ramakrishna Mission Vivekananda Educational
                                  and Research Institute, India
Pritee Khanna                     IIITDM Jabalpur, India
Arti Pandey                       IIT Ropar, India
Tarkeshwar Singh                  BITS Pilani K K Birla Goa Campus, India


Additional Reviewers

Ankush Acharyya
N. R. Arvind
Devsi Bantva
Manu Basavaraju
Srimanta Bhattacharya
Sriram Bhyravarapu
Arun Das
Hiranya Kishore Dey
Amit Kumar Dhar
Tanja Dravec
Barun Gorain
Vinod Reddy I.


Marko Jakovac
Jesper Jansson
Ce Jin
Sreejith K. P.
Anjeneya Swami Kare
Niraj Khare
Mirosław Kowaluk
Christos Levcopoulos
Vincenzo Liberatore
Tian Liu
Raghunath Reddy M.
Atrayee Majumder
Tapas Kumar Mishra
Shuichi Miyazawaki
Fahad Panolan
Pablo Perez-Lantero
Veena Prabhakaran
Sadagopan N.
Francis P.
Venkata Subba Reddy P.
Sajith Padinhatteeri
Narad Rampersad
M. V. Panduranga Rao
Prakash Saivasan
Brahadeesh Sankarnarayanan
Ildikó Schlotter
Elzbieta Tumidajewicz
Karol Wegrzycki
Mariusz Wozniak
Hyeyun Yang
Ismael Gonzalez Yero
Jingru Zhang
Pawel Zylinski


Abstracts of Invited Talks 


All-Pairs Shortest Paths and Fine-Grained Complexity 


Timothy M. Chan 


University of Illinois at Urbana-Champaign, Urbana, USA 
tmc@illinois.edu 


Abstract. The all-pairs shortest paths (APSP) problem is one of the
most fundamental problems in algorithm design and fine-grained
complexity. The problem for general weighted dense graphs is
conjectured to require close to $n^3$ time. On the other hand, substantially
subcubic algorithms are known in some important special cases via
fast matrix multiplication; for example, for directed graphs that are
unweighted (or have small integer weights), the current best algorithm
due to Zwick (FOCS 1998) has running time near $n^{2.5}$ if the matrix
multiplication exponent $\omega$ is equal to 2.

In this talk, I will survey the current landscape surrounding the 
complexity of APSP and its variants, and how the conjectured hardness of
APSP in the general and unweighted cases has been used as the basis for
establishing conditional lower bounds for other problems. In particular,
I will describe recent joint work with Virginia Vassilevska Williams and 
Yinzhan Xu (ICALP 2021), showing that Zwick’s algorithm is in some 
sense optimal for directed unweighted graphs. 


Linear Programming and its Uses in Algorithm Design 


Daya R. Gaur 


University of Lethbridge, Lethbridge, Canada 
gaur@cs.uleth.ca 


Abstract. Linear programming has a rich history. In this talk, we focus 
on its use in algorithm design. We will look at its use in three areas. 
The first is the design of exact algorithms, and the second is the design 
of approximate algorithms. The third is the creation of practical
algorithms for computationally challenging problems. I will give several
examples of how researchers in my group use linear programming to 
develop exact and approximate algorithms. These illustrative examples 
will also highlight the computational challenges still remaining. Most of 
the theory that will be covered is explained nicely in these books [Dantzig 
and Thapa, 2006; Vazirani, 2003; Lau et al., 2011; Cook et al., 1998]. This 
talk will be a little tour of the strengths of linear programming and how 
to use them. This introductory talk will be a mix of theory and practice 
and no background is assumed. 


Approximation Algorithms for Some Geometric 
Optimization Problems 


Joseph S. B. Mitchell 


Stony Brook University, Stony Brook, USA 
joseph.mitchell@stonybrook.edu 


Abstract. We discuss approximation algorithms for some instances of 
geometric optimization problems, including maximum independent set, 
dominating set, vehicle routing, and set cover. In all cases the problems 
are specified by geometric data, such as points, rectangles, polygons, 
and disks, and the results strongly exploit geometry to yield better 
results than can be achieved (or at least better than results known so 
far) in non-geometric settings. We are motivated by applications 
of computational geometry in sensor networks and mobile robotics, 
including classic problems on “art galleries” that need to be guarded 
by static or mobile guards within a polygonal domain. Almost all of 
these optimization problems are NP-hard even in simple two-dimensional 
settings. The problems get even harder when we take into account
uncertain data, time constraints for scheduled coverage, and
routing/connectivity problems in combination with coverage constraints.
We discuss selected versions of these geometric optimization problems 
from the perspective of approximation algorithms and we describe some 
techniques that have led to new or improved approximation bounds for 
certain maximum independent set and routing/coverage problems.


A Proof of the Multiplicative 1-2-3
Conjecture


Julien Bensmail¹, Hervé Hocquard², Dimitri Lajou², and Éric Sopena²


¹ Université Côte d'Azur, CNRS, Inria, I3S, Biot, France
² Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI,
UMR 5800, 33400 Talence, France


Abstract. We prove that the product version of the 1-2-3 Conjecture,
raised by Skowronek-Kaziów in 2012, is true. Namely, for every connected
graph with order at least 3, we can assign labels 1,2,3 to the edges so 
that no two adjacent vertices are incident to the same product of labels. 


Keywords: 1-2-3 Conjecture · Product version · Labels 1, 2, 3


1 Introduction 


Let $G$ be a graph. A $k$-labelling $\ell : E(G) \to \{1, \dots, k\}$ is an assignment of
labels $1, \dots, k$ to the edges of $G$. From $\ell$, we can compute different parameters
of interest for all vertices $v$, such as the sum $\sigma_\ell(v)$ of incident labels (being
formally $\sigma_\ell(v) = \sum_{u \in N(v)} \ell(uv)$), or similarly the multiset $\mu_\ell(v)$ of
labels incident to $v$ or the product $\rho_\ell(v)$ of labels incident to $v$. We say that $\ell$ is
s-proper if $\sigma_\ell$ is a proper vertex-colouring of $G$, i.e., we have
$\sigma_\ell(u) \neq \sigma_\ell(v)$ for every edge $uv \in E(G)$. Similarly, we say that $\ell$ is
m-proper and p-proper, if $\mu_\ell$ and $\rho_\ell$, respectively, form proper vertex-colourings of $G$.
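
As an illustration only (it is not part of the paper), the three vertex parameters defined above can be computed directly from an edge labelling. The sketch below assumes a graph given as a list of edge tuples and a labelling given as a dictionary; the names `vertex_parameters` and `is_proper` are hypothetical helpers introduced for this example.

```python
from collections import Counter
from math import prod


def vertex_parameters(edges, labelling):
    """Return (sigma, mu, rho): sum, multiset and product of incident labels.

    edges     -- list of 2-tuples (u, v), one per undirected edge
    labelling -- dict mapping each edge tuple to a label in {1, ..., k}
    """
    incident = {}
    for (u, v) in edges:
        label = labelling[(u, v)]
        incident.setdefault(u, []).append(label)
        incident.setdefault(v, []).append(label)
    sigma = {v: sum(labels) for v, labels in incident.items()}
    mu = {v: Counter(labels) for v, labels in incident.items()}
    rho = {v: prod(labels) for v, labels in incident.items()}
    return sigma, mu, rho


def is_proper(edges, colour):
    """True if the two ends of every edge receive different values."""
    return all(colour[u] != colour[v] for (u, v) in edges)


# A path a-b-c-d whose edges are labelled 1, 2, 3 in order.
path = [("a", "b"), ("b", "c"), ("c", "d")]
labels = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3}
sigma, mu, rho = vertex_parameters(path, labels)
print(is_proper(path, sigma), is_proper(path, mu), is_proper(path, rho))
```

On this path the products are 1, 2, 6 and 3 for a, b, c and d, so the labelling is p-proper; labelling every edge 1 instead gives product 1 everywhere and is not.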

In the context of so-called distinguishing labellings, the goal is generally to
not only distinguish vertices within some distance according to some parameter
computed from labellings (such as the parameters $\sigma_\ell$, $\mu_\ell$ and $\rho_\ell$ above, to name
a few), but also to construct such $k$-labellings with $k$ as small as possible. We
refer the interested reader to [4], which lists hundreds of labelling techniques.

Regarding s-proper, m-proper and p-proper labellings, which are the main
focus in this work, we are thus interested, as mentioned above, in finding such
$k$-labellings with $k$ as small as possible, for a given graph $G$. In other words, we are
interested in the parameters $\chi_S(G)$, $\chi_M(G)$ and $\chi_P(G)$ which denote the smallest
$k \geq 1$ such that s-proper, m-proper and p-proper, respectively, $k$-labellings exist
(if any). Actually, through greedy labelling arguments, it can be observed that
the only connected graph $G$ for which $\chi_S(G)$, $\chi_M(G)$ or $\chi_P(G)$ is not defined, is
$K_2$, the complete graph on 2 vertices. Consequently, these three parameters are
generally investigated for so-called nice graphs, which are those graphs with no
connected component isomorphic to $K_2$.


Some proofs in this paper are voluntarily omitted due to space limitation; the interested
reader will find them in [3], the full version of the current paper. This work is partially
supported by the ANR project HOSIGRA (ANR-17-CE40-0022).

© Springer Nature Switzerland AG 2022

S-proper, m-proper and p-proper labellings form a subfield of distinguishing
labellings, which has been attracting attention due to the so-called 1-2-3
Conjecture, raised, in [6], by Karoński, Łuczak and Thomason in 2004:


1-2-3 Conjecture (sum version). If $G$ is a nice graph, then $\chi_S(G) \leq 3$.


Later on, counterparts of the 1-2-3 Conjecture were raised for m-proper and
p-proper labellings. Addario-Berry et al. first raised, in 2005, the following in [1]:


1-2-3 Conjecture (multiset version). If $G$ is a nice graph, then $\chi_M(G) \leq 3$.

while Skowronek-Kaziów then raised, in 2012, the following in [8]:

1-2-3 Conjecture (product version). If $G$ is a nice graph, then $\chi_P(G) \leq 3$.


It is worth mentioning that all three conjectures above, if true, would be 
tight, as attested for instance by complete graphs. Note also that the multiset 
version of the 1-2-3 Conjecture is, out of the three variants, the easiest one in a 
sense, as every s-proper or p-proper labelling is also m-proper (thus, proving the 
sum or product variant of the 1-2-3 Conjecture would prove the multiset one). 

To date, the best result towards the sum version of the 1-2-3 Conjecture,
proved by Kalkowski, Karoński and Pfender in [5], is that $\chi_S(G) \leq 5$ holds for
every nice graph $G$. Another significant result is due to Przybyło, who recently
proved in [7] that even $\chi_S(G) \leq 4$ holds for every nice regular graph $G$. Karoński,
Łuczak and Thomason themselves also proved in [6] that $\chi_S(G) \leq 3$ holds for
nice 3-colourable graphs. Regarding the multiset version, for long the best result
was the one proved by Addario-Berry, Aldred, Dalal and Reed in [1], stating that
$\chi_M(G) \leq 4$ holds for every nice graph $G$. Building on that result,
Skowronek-Kaziów later proved in [8] that $\chi_P(G) \leq 4$ holds for every nice graph $G$. She also
proved that $\chi_P(G) \leq 3$ holds for every nice 3-colourable graph $G$.

A breakthrough result was recently obtained by Vučković, as he fully
proved the multiset version of the 1-2-3 Conjecture in [9]. Due to connections
between m-proper and p-proper 3-labellings, we observed in [2] that this result
directly implies that $\chi_P(G) \leq 3$ holds for every nice regular graph $G$. Inspired
by Vučković's proof scheme, we were also able to prove that $\chi_P(G) \leq 3$ holds
for nice 4-colourable graphs $G$, and to prove related results that are very close
to what is stated in the product version of the 1-2-3 Conjecture.

Building on these results, we prove the following throughout this paper. 


Theorem 1. The product version of the 1-2-3 Conjecture is true. That is, every
nice graph admits p-proper 3-labellings.


2 Proof of Theorem 1 


Let us start by introducing some terminology and recalling some properties of
p-proper labellings, which will be used throughout the proof. Let $G$ be a graph,
and $\ell$ be a 3-labelling of $G$. For a vertex $v \in V(G)$ and a label $i \in \{1,2,3\}$, we
denote by $d_i(v)$ the $i$-degree of $v$ by $\ell$, being the number of edges incident to $v$
that are assigned label $i$ by $\ell$. Note then that $\rho_\ell(v) = 2^{d_2(v)} 3^{d_3(v)}$. We say that $v$
is 1-monochromatic if $d_2(v) = d_3(v) = 0$, while we say that $v$ is 2-monochromatic
(3-monochromatic, resp.) if $d_2(v) > 0$ and $d_3(v) = 0$ ($d_3(v) > 0$ and $d_2(v) = 0$,
resp.). In case $v$ has both 2-degree and 3-degree at least 1, we say that $v$ is
bichromatic. We also define the {2,3}-degree of $v$ as the sum $d_2(v) + d_3(v)$ of its
2-degree and 3-degree. If $v$ is bichromatic, then its {2,3}-degree is at least 2.

Because $\ell$ assigns labels 1, 2, 3, and, in particular, because 2 and 3 are
coprime, note that, for every edge $uv$ of $G$, we have $\rho_\ell(u) \neq \rho_\ell(v)$ when $u$
and $v$ have different 2-degrees, 3-degrees, or {2,3}-degrees. In particular, $u$ and
$v$ cannot be in conflict, i.e., satisfy $\rho_\ell(u) = \rho_\ell(v)$, if $u$ and $v$ are $i$-monochromatic
and $j$-monochromatic for $i \neq j$, or if $u$ is monochromatic while $v$ is bichromatic.
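
The coprimality remark can be made concrete with a short sketch. It is an illustration under assumptions, not the authors' code: the labelling is assumed to be a dictionary from edge tuples to labels in {1, 2, 3}, and the helper names `label_degrees`, `vertex_type`, `product` and `in_conflict` are hypothetical. Since 2 and 3 are coprime, two products coincide exactly when both the 2-degrees and the 3-degrees coincide, so a conflict can be tested on the degree pairs alone.

```python
def label_degrees(v, labelling):
    """Return (d2, d3): numbers of edges at v labelled 2 and labelled 3."""
    d2 = sum(1 for e, lab in labelling.items() if v in e and lab == 2)
    d3 = sum(1 for e, lab in labelling.items() if v in e and lab == 3)
    return d2, d3


def vertex_type(d2, d3):
    """Classify a vertex from its 2-degree and 3-degree."""
    if d2 == 0 and d3 == 0:
        return "1-monochromatic"
    if d3 == 0:
        return "2-monochromatic"
    if d2 == 0:
        return "3-monochromatic"
    return "bichromatic"


def product(d2, d3):
    """Product of incident labels, written as 2^d2 * 3^d3."""
    return 2 ** d2 * 3 ** d3


def in_conflict(u, v, labelling):
    """Equal products are equivalent to equal (d2, d3) pairs."""
    return label_degrees(u, labelling) == label_degrees(v, labelling)


# Triangle a-b-c with labels 2, 3, 1: the three products are pairwise distinct.
triangle = {("a", "b"): 2, ("b", "c"): 3, ("a", "c"): 1}
for x in "abc":
    d2, d3 = label_degrees(x, triangle)
    print(x, vertex_type(d2, d3), product(d2, d3))
```

Working with the pair (d2, d3) rather than with the product itself keeps the numbers small and matches the way the proof reasons about 2-degrees, 3-degrees and {2,3}-degrees.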

Before going into the proof of Theorem 1, let us start by giving an overview
of it. Let $G$ be a nice graph. Our goal is to build a p-proper 3-labelling $\ell$ of $G$.
We can clearly assume that $G$ is connected. We also set $t = \chi(G)$, where, recall,
$\chi(G)$ refers to the chromatic number¹ of $G$. In particular, $t \geq 2$.

In what follows, we construct $\ell$ through three main steps. First, we need to
partition the vertices of $G$ in a way satisfying specific cut properties, forming
what we call a valid partition of $V(G)$ (see later Definition 1 for a more formal
definition). In short, a valid partition $\mathcal{V} = (V_1, \dots, V_t)$ is a partition of $V(G)$ into
$t$ independent sets $V_1, \dots, V_t$ fulfilling two main properties, being, roughly put,
that 1) every vertex $v$ in some part $V_i$ with $i > 1$ has an incident upward edge
to every part $V_j$ with $j < i$, and 2) for every connected component of $G[V_1 \cup V_2]$
having only one edge, we can freely swap its two vertices in $V_1$ and $V_2$ while
preserving the main properties of a valid partition.

Once we have this valid partition $\mathcal{V}$ in hand, we can then start constructing
$\ell$. The main part of the labelling process, Step 2 below, consists in starting from
all edges of $G$ being assigned label 1 by $\ell$, and then processing the vertices of
$V_3, \dots, V_t$ one after another, possibly changing the labels by $\ell$ assigned to some
of their incident edges, so that certain product types are achieved by $\rho_\ell$. These
desired product types can be achieved due to the many upward edges that some
vertices are incident to (in particular, the deeper a vertex lies in $\mathcal{V}$, the more
upward edges it is incident to). The product types we achieve for the vertices
depend on the part $V_i$ of $\mathcal{V}$ they belong to. In particular, the modifications we
make on $\ell$ guarantee that all vertices in $V_3, \dots, V_t$ are bichromatic, every two
vertices in $V_i$ and $V_j$ with $i, j \in \{3, \dots, t\}$ and $i \neq j$ have different 2-degrees
or 3-degrees, all vertices in $V_2$ are 1-monochromatic or 2-monochromatic, and
all vertices in $V_1$ are 1-monochromatic or 3-monochromatic. By itself, achieving
these product types makes $\ell$ almost p-proper, in the sense that the only possible
conflicts are between 1-monochromatic vertices in $V_1$ and $V_2$. An important point
also, is that, through these label modifications, we will make sure that all edges
of $G[V_1 \cup V_2]$ remain assigned label 1, and no vertex in $V_3 \cup \dots \cup V_t$ has 3-degree 1,
2-degree at least 2, and odd {2,3}-degree; in last Step 3 below, we will use that
last fact to remove remaining conflicts by allowing some vertices of $V_1 \cup V_2$ to
become special, i.e., make such vertices $v$ satisfy $d_3(v) = 1$, $d_2(v) \geq 2$, and
$d_2(v) + d_3(v) \equiv 1 \bmod 2$, while making sure that the products of the vertices in
$V_3 \cup \dots \cup V_t$ are not altered.


¹ Recall that a proper $k$-vertex-colouring of a graph $G$ is a partition $(V_1, \dots, V_k)$ of
$V(G)$ where all $V_i$'s are independent. The chromatic number $\chi(G)$ of $G$ is the smallest
$k \geq 1$ such that proper $k$-vertex-colourings of $G$ exist. $G$ is $k$-colourable if $\chi(G) \leq k$.

Step 3 is designed to get rid of the last conflicts between the adjacent
1-monochromatic vertices of $V_1$ and $V_2$ without introducing new ones in $G$. To
that end, we will consider the set $\mathcal{H}$ of the connected components of $G[V_1 \cup V_2]$
having conflicting vertices, and, if needed, modify the labels assigned by $\ell$ to
some of their incident edges so that no conflicts remain, and no new conflicts are
created in $G$. To make sure that no new conflicts are created between vertices
in $V_1 \cup V_2$ and vertices in $V_3 \cup \dots \cup V_t$, we will modify labels while making sure
that all vertices in $V_1 \cup V_2$ are monochromatic or special. An important point
also, is that the fixing procedures we introduce require the number of edges in a
connected component of $\mathcal{H}$ to be at least 2. Because of that, once Step 2 ends,
we must ensure that $\mathcal{H}$ does not contain a connected component with only one
edge incident to two 1-monochromatic vertices. To guarantee this, we will also
make sure, during Step 2, to modify labels and the partition $\mathcal{V}$ slightly so that
$\mathcal{H}$ has no such configuration.


Step 1: Constructing a valid partition 


Let $\mathcal{V} = (V_1, \dots, V_t)$ be a partition of $V(G)$ where each $V_i$ is an independent set.
Note that such a partition exists, as, for instance, any proper $t$-vertex-colouring
of $G$ forms such a partition of $V(G)$. For every vertex $u \in V_i$, an incident upward
edge (downward edge, resp.) is an edge $uv$ for which $v$ belongs to some $V_j$ with
$j < i$ ($j > i$, resp.). Note that all vertices in $V_1$ have no incident upward edges,
while all vertices in $V_t$ have no incident downward edges.

We denote by $M_0(\mathcal{V})$ (also denoted $M_0$ when the context is clear) the set of
isolated edges in the subgraph $G[V_1 \cup V_2]$ of $G$ induced by the vertices of $V_1 \cup V_2$.
That is, $M_0$ contains the edges of the connected components of $G[V_1 \cup V_2]$ that
consist in one edge only. To lighten the exposition, whenever referring to the
vertices of $M_0$, we mean the vertices of $G$ incident to the edges in $M_0$.

For an edge $uv \in M_0$ with $u \in V_1$ and $v \in V_2$, swapping $uv$ consists in
modifying the partition $\mathcal{V}$ by removing $u$ from $V_1$ ($v$ from $V_2$, resp.) and adding
it to $V_2$ ($V_1$, resp.). In other words, we exchange the parts to which $u$ and
$v$ belong. Note that if $V_1$ and $V_2$ are independent sets before the swap, then,
because $uv \in M_0$, by definition the resulting new $V_1$ and $V_2$ remain independent.
Also, the set $M_0$ is unchanged by the swap operation.
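
The set of isolated edges and the swap operation can be sketched as follows. This is a minimal illustration under assumptions: the two lowest parts are Python sets, the graph is a list of edge tuples, and the names `isolated_edges` and `swap` are hypothetical. An edge of the subgraph induced by the two lowest parts forms a component on its own exactly when both of its ends have degree 1 in that subgraph.

```python
def isolated_edges(edges, V1, V2):
    """Edges that are whole connected components of G[V1 ∪ V2]."""
    low = V1 | V2
    inner = [(u, v) for (u, v) in edges if u in low and v in low]
    degree = {}
    for (u, v) in inner:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return [(u, v) for (u, v) in inner if degree[u] == 1 and degree[v] == 1]


def swap(edge, V1, V2):
    """Exchange the parts of the two ends of an isolated edge.

    Returns new sets (V1, V2); the inputs are left unmodified.
    """
    u, v = edge
    if u in V2:            # make sure u is the end lying in V1
        u, v = v, u
    return (V1 - {u}) | {v}, (V2 - {v}) | {u}


# a-b is an isolated edge of G[V1 ∪ V2]; c-d-e is a path with two edges;
# x lies in a higher part and is adjacent to a.
edges = [("a", "b"), ("c", "d"), ("d", "e"), ("x", "a")]
V1, V2 = {"a", "c", "e"}, {"b", "d"}
M0 = isolated_edges(edges, V1, V2)
new_V1, new_V2 = swap(M0[0], V1, V2)
print(M0, sorted(new_V1), sorted(new_V2))
```

In this example the only isolated edge is a-b, and recomputing the isolated edges after the swap returns the same set, in line with the remark that the swap leaves it unchanged.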

We can now give a formal definition for the notion of valid partition. 


Definition 1 (Valid partition). For a $t$-colourable graph $G$, a partition $\mathcal{V} =
(V_1, \dots, V_t)$ of $V(G)$ is valid (for $G$) if $\mathcal{V}$ satisfies the following properties.


(I) Every $V_i$ is an independent set.
($P_1$) Every vertex in some $V_i$ with $i \geq 2$ has a neighbour in $V_j$ for every $j < i$.
(S) For every sequence $(e_i)_i$ of edges of $M_0(\mathcal{V})$, successively swapping every $e_i$
(in any order) results in a partition $\mathcal{V}'$ satisfying Properties (I) and ($P_1$).


Note that Property (S) implies the following property:
($P_2$) Swapping any number of edges of $M_0(\mathcal{V})$ results in a valid partition $\mathcal{V}'$.


To prove Theorem 1, as mentioned earlier, to start constructing $\ell$ we need to
have a valid partition of $G$ in hand. The following result guarantees its existence.


Lemma 1. Every nice t-colourable graph G admits a valid partition. 


Proof. For a partition $\mathcal{V} = (V_1, \dots, V_t)$ of $V(G)$ where each $V_i$ is independent
(such a partition exists, as attested by any proper $t$-vertex-colouring of $G$), set
$f(\mathcal{V}) = \sum_{k=1}^{t} k \cdot |V_k|$. Among all possible $\mathcal{V}$'s, consider a $\mathcal{V}$ that minimises $f(\mathcal{V})$.

Suppose that there is a vertex $u \in V_i$ with $i \geq 2$ for which Property ($P_1$)
does not hold, i.e., there is a $j < i$ such that $u$ has no incident upward edge
to $V_j$. By moving $u$ to $V_j$, we obtain another partition $\mathcal{V}'$ of $V(G)$ where every
part is an independent set. However, note that $f(\mathcal{V}') = f(\mathcal{V}) + j - i < f(\mathcal{V})$, a
contradiction to the minimality of $\mathcal{V}$. From this, we deduce that every partition
$\mathcal{V}$ minimising $f$ must satisfy Property ($P_1$).

Let now $\mathcal{V}'$ be the partition of $V(G)$ obtained by successively swapping edges
of $M_0(\mathcal{V})$. Recall that the swapping operation preserves Property (I) and observe
that $f(\mathcal{V}) = f(\mathcal{V}')$. Hence, $\mathcal{V}'$ minimises $f$ and thus satisfies Properties (I)
and ($P_1$). Thus Property (S) also holds, and $\mathcal{V}$ is a valid partition of $G$.
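
The exchange argument in the proof suggests a simple local search, sketched below as an illustration rather than as the authors' procedure. It repeatedly moves a vertex into a lower part in which it has no neighbour, which strictly decreases $f$ and keeps every part independent. The names `f`, `satisfies_P1` and `push_up`, and the adjacency-dictionary representation, are assumptions of this example. Note the limitation: the search stops at a local minimum of $f$, which yields Properties (I) and ($P_1$) but, unlike the global minimiser used in the proof, does not by itself guarantee Property (S).

```python
def f(parts):
    """The potential from the proof: sum over k of k * |V_k| (k starts at 1)."""
    return sum(k * len(part) for k, part in enumerate(parts, start=1))


def satisfies_P1(adj, parts):
    """Every vertex of V_i (i >= 2) has a neighbour in every V_j with j < i."""
    return all(adj[u] & parts[j]
               for i in range(1, len(parts))
               for u in parts[i]
               for j in range(i))


def push_up(adj, parts):
    """Move vertices to lower-indexed parts until Property (P1) holds.

    adj   -- dict mapping each vertex to the set of its neighbours
    parts -- list of independent sets [V_1, ..., V_t]
    Each move decreases f by at least 1, so the loop terminates.
    """
    parts = [set(p) for p in parts]
    moved = True
    while moved:
        moved = False
        for i in range(1, len(parts)):
            for u in list(parts[i]):
                for j in range(i):
                    if not (adj[u] & parts[j]):   # no neighbour in V_{j+1}
                        parts[i].remove(u)
                        parts[j].add(u)
                        moved = True
                        break
    return parts


# Path a-b-c-d, initially split into three independent sets.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
start = [{"a"}, {"b", "d"}, {"c"}]
result = push_up(adj, start)
print(f(start), f(result), satisfies_P1(adj, result))
```

Here the vertex d has no neighbour in the first part, so it is moved there and the potential drops from 8 to 7.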


From here, we assume that we have a valid partition $\mathcal{V} = (V_1, \dots, V_t)$ of $G$.


Step 2: Labelling the upward edges of $V_3, \dots, V_t$


From $G$ and $\mathcal{V}$, our goal now is to construct a 3-labelling $\ell$ of $G$ achieving certain
properties, the most important of which being that the only possible conflicts
are between pairs of vertices of $V_1$ and $V_2$ that do not form an edge of $M_0$. The
following result sums up the exact conditions we want $\ell$ to fulfil. Recall that a
vertex $v$ is special by $\ell$, if $d_3(v) = 1$, $d_2(v) \geq 2$ and $d_2(v) + d_3(v)$ is odd. Note
that special vertices are bichromatic.


Lemma 2. For every nice graph $G$ and every valid partition $(V_1, \dots, V_t)$ of $G$,
there exists a 3-labelling $\ell$ of $G$ such that:

1. all vertices of $V_1$ are either 1-monochromatic or 3-monochromatic,
2. all vertices of $V_2$ are either 1-monochromatic or 2-monochromatic,
3. all vertices of $V_3 \cup \dots \cup V_t$ are bichromatic,
4. no vertex is special,
5. if $u \in V_1$ and $v \in V_2$ are adjacent, then $\ell(uv) = 1$,
6. if two vertices $u$ and $v$ are in conflict, then $u \in V_1$ and $v \in V_2$ (or vice versa),
   and at least one of $u$ or $v$ has another neighbour $w$ in $V_1 \cup V_2$.


Proof. From now on, we fix the valid partition $\mathcal{V} = (V_1, \dots, V_t)$ of $G$. During
the construction of $\ell$, we may have, however, to swap some edges of $M_0$, resulting
in a different valid partition of $G$. Abusing the notations, for simplicity we
will still denote by $\mathcal{V}$ any valid partition of $G$ obtained this way, through swapping
edges. Recall that valid partitions are closed under swapping edges of $M_0$
(Property ($P_2$) of Definition 1).

Our goal is to design $\ell$ so that it not only satisfies the four colour properties
of Items 1 to 4 of the statement, but also achieves the following refined product
types, for every vertex $v$ in a part $V_i$ of $\mathcal{V}$:


– $v \in V_1$: $v$ is 1-monochromatic or 3-monochromatic;
– $v \in V_2$: $v$ is 1-monochromatic or 2-monochromatic;
– $v \in V_3$: $v$ is bichromatic with 2-degree 1 and even {2,3}-degree;
– $v \in V_4$: $v$ is bichromatic with 3-degree 2 and odd {2,3}-degree;
– $v \in V_5$: $v$ is bichromatic with 2-degree 2 and even {2,3}-degree;
– ...
– $v \in V_{2n}$, $n \geq 3$: $v$ is bichromatic with 3-degree $n$ and odd {2,3}-degree;
– $v \in V_{2n+1}$, $n \geq 3$: $v$ is bichromatic with 2-degree $n$ and even {2,3}-degree.
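
The pattern in the list above can be captured by a small function, given here as a sketch under assumptions (the names `target_type`, `matches` and `is_special` are hypothetical, and the encoding of a product type as a triple is a choice made for this example). For an even index $i = 2n$ the 3-degree is fixed to $n$ and the {2,3}-degree is odd; for an odd index $i = 2n + 1$ the 2-degree is fixed to $n$ and the {2,3}-degree is even. The sketch also encodes the definition of a special vertex, so that one can check by enumeration that no degree pair fits two different parts and that no target type is special.

```python
def target_type(i):
    """Target for a vertex of V_i, i >= 3.

    Returns (label, degree, parity): the label (2 or 3) whose degree is
    fixed, the value of that degree, and the parity of the {2,3}-degree.
    """
    if i < 3:
        raise ValueError("product types of this form start at V_3")
    if i % 2 == 0:
        return (3, i // 2, "odd")
    return (2, (i - 1) // 2, "even")


def matches(i, d2, d3):
    """True if a vertex with 2-degree d2 and 3-degree d3 has the type of V_i."""
    label, degree, parity = target_type(i)
    fixed = d3 if label == 3 else d2
    wanted = 1 if parity == "odd" else 0
    bichromatic = d2 > 0 and d3 > 0
    return bichromatic and fixed == degree and (d2 + d3) % 2 == wanted


def is_special(d2, d3):
    """Special vertex: 3-degree 1, 2-degree at least 2, odd {2,3}-degree."""
    return d3 == 1 and d2 >= 2 and (d2 + d3) % 2 == 1


for i in range(3, 8):
    print(i, target_type(i))
```

Two parts whose indices have different parities are separated by the parity of the {2,3}-degree, and two parts whose indices have the same parity are separated by the fixed degree.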


We start from $\ell$ assigning label 1 to all edges of $G$. Let us now describe how to
modify $\ell$ so that the conditions above are met for all vertices. We consider the
vertices of $V_t, \dots, V_3$ following that order, from "bottom to top", and modify
labels assigned to upward edges. An important condition we will maintain, is
that every vertex in an odd part $V_{2n+1}$ ($n \geq 0$) has all its incident downward
edges (if any) labelled 3 or 1, while every vertex in an even part $V_{2n}$ ($n \geq 1$) has
all its incident downward edges (if any) labelled 2 or 1. Note that this is trivially
satisfied for the vertices in $V_t$, since they have no incident downward edges.

At any point in the process, let $M$ be the set of edges of $M_0$ for which
both ends are 1-monochromatic (initially, $M = M_0$). When treating a vertex
$u \in V_3 \cup \dots \cup V_t$, we define $M_u$ as the subset of edges of $M$ having an end that
is a neighbour of $u$. For every edge $e \in M_u$, we choose one end of $e$ that is a
neighbour of $u$ and we add it to a set $S_u$. Note that $|S_u| = |M_u|$. Another goal
during the labelling process, to fulfil Item 6, is to label the edges incident to
$u$ so that at least one end of every edge in $M_u$ is no longer 1-monochromatic.
Note that the set $M_u$ considered when labelling the edges incident to $u$ is not
necessarily the set of edges of $M_0$ incident to a neighbour of $u$, as, during the
whole process, some of these edges might be removed from $M$ when dealing with
previous vertices in $V_3 \cup \dots \cup V_t$.

Let us now consider the vertices in $V_t, \dots, V_3$ one by one, following that order.
Let thus $u \in V_i$ be a vertex that has not been treated yet, with $i \geq 3$. Recall
that every vertex belonging to some $V_j$ with $j > i$ was treated earlier on, and
thus has its desired product. Suppose that $i = 2n$ with $n \geq 2$ ($i = 2n + 1$ with
$n \geq 1$, resp.). Recall also that $u$ is assumed to have all its incident downward
edges labelled 1 or 2 (3, resp.), due to how vertices in $V_j$'s with $j > i$ have been
treated earlier on, and to have all its incident upward edges labelled 1.

If $M_u \neq \emptyset$, then we swap edges of $M_u$, if necessary, so that every vertex in
$S_u$ belongs to $V_2$ ($V_1$, resp.). This does not invalidate any of our invariants since
both ends of an edge in $M_u$ are 1-monochromatic.

In any case, by Property ($P_1$), we know that, for every $j < i$, there is a vertex
$x_j \in V_j$ which is a neighbour of $u$. In particular, the vertex $x_1$ ($x_2$, resp.) does
not belong to $S_u$ (but may be the other end of an edge in $M_u$). We label the
edges $ux_3, ux_5, \dots, ux_{2n-1}$ with 3 ($ux_4, ux_6, \dots, ux_{2n}$ with 2, resp.). Note that,
at this point, $d_3(u) = n - 1$ ($d_2(u) = n - 1$, resp.). To finish dealing with $u$, we
need to distinguish two cases depending on whether $M_u$ is empty or not.


– Suppose first that $M_u = \emptyset$. Label $ux_1$ with 3 ($ux_2$ with 2, resp.). Now $u$ has
  the desired 3-degree (2-degree, resp.). If $i > 3$, then label $ux_{i-2}$ with 2 (3,
  resp.) so that $u$ is sure to be bichromatic. If $i > 3$ and the {2,3}-degree of $u$
  does not have the desired parity, then label $ux_2$ with 2 ($ux_1$ with 3, resp.).
  If $u \in V_3$ and the {2,3}-degree of $u$ is even, then $u$ is already bichromatic
  since $d_2(u) = 1$. If $u \in V_3$ and the {2,3}-degree of $u$ is odd, then label $ux_1$
  with 3 to adjust the parity of the {2,3}-degree of $u$ and make $u$ bichromatic.
  In all cases, $u$ gets bichromatic with 3-degree $n$ (2-degree $n$, resp.) and odd
  {2,3}-degree (even {2,3}-degree, resp.), which is what is desired for $u$.

– Suppose now that $M_u \neq \emptyset$. Let $z \in S_u$ and let $e$ be the edge of $M_u$ containing
  $z$. For every $w \in S_u \setminus \{z\}$, we label the edge $uw$ with 2 (3, resp.). Then:

  • If $d_2(u) + d_3(u)$ is odd (even, resp.), then label $uz$ with 2 (3, resp.) and
    $ux_1$ with 3 ($ux_2$ with 2, resp.). In this case, every edge in $M_u$ is incident to
    at least one vertex which is not 1-monochromatic, while $u$ is bichromatic
    with 3-degree $n$ (2-degree $n$, resp.) and odd {2,3}-degree (even
    {2,3}-degree, resp.).
  • If $d_2(u) + d_3(u)$ is even (odd, resp.) and $d_2(u) > 0$ ($d_3(u) > 0$, resp.),
    then swap $e$ and label $uz$ with 3 (2, resp.). Note that, after the swap
    of $e$, we have $z \in V_1$ ($z \in V_2$, resp.). In this case, every edge in $M_u$ is
    incident to at least one vertex which is not 1-monochromatic, while $u$
    is bichromatic with 3-degree $n$ (2-degree $n$, resp.) and odd {2,3}-degree
    (even {2,3}-degree, resp.).
  • The last case is when $d_2(u) + d_3(u)$ is even (odd, resp.) and $d_2(u) = 0$
    ($d_3(u) = 0$, resp.). If $i > 4$, then we can label $ux_{i-2}$ with 2 (3, resp.)
    and fall back into one of the previous cases. If $i = 4$, then the only
    edge labelled 3 is the edge $ux_3$ which implies that $d_3(u) = 1$, which is
    impossible since $d_2(u) = 0$ would then give $d_2(u) + d_3(u) = 1$, which is
    odd, whereas this case assumes it is even. If $i = 3$, then the
    conditions of this case imply that $d_2(u) = 1$ while every upward edge
    incident to $u$ is labelled 1 or 3 and similarly for every incident downward
    edge; this case thus cannot occur.

  To finish, we remove the edges of $M_u$ from $M$ since their two ends are not
  both 1-monochromatic any more.


At the end of this process, all vertices in $V_1$ are 1-monochromatic or
3-monochromatic, while all vertices in $V_2$ are 1-monochromatic or 2-monochromatic.
Every vertex in $V_3 \cup \dots \cup V_t$ is bichromatic and there are no conflicts involving any
pair of these vertices. Indeed if $a \in V_i$ and $b \in V_j$ are adjacent with $i > j \geq 3$, then
either $i$ and $j$ do not have the same parity, in which case $a$ and $b$ do not have the
same {2,3}-degree; or both $i$ and $j$ are even (odd, resp.) and
$d_3(a) = \frac{i}{2} \neq \frac{j}{2} = d_3(b)$ ($d_2(a) = \frac{i-1}{2} \neq \frac{j-1}{2} = d_2(b)$, resp.).
Note also that no vertex in $G$ is special, as
special vertices have 3-degree 1, 2-degree at least 2, and odd {2,3}-degree. Also,
we did not relabel any edge in the cut $(V_1, V_2)$.

Finally, suppose that there is a conflict between two vertices $u$ and $v$. Previous
remarks imply that $u \in V_1$ and $v \in V_2$ (or vice versa) and that both $u$ and $v$
are 1-monochromatic. If none of $u$ and $v$ has another neighbour $w$ in $V_1 \cup V_2$,
then the edge $uv$ belongs to the set $M_0$. Since $G$ is nice, one of $u$ or $v$ must have
a neighbour $z$ in $V_3 \cup \dots \cup V_t$. Hence $uv \in M_z$. Recall also that we relabelled
the edges incident to $z$ in such a way that, for every edge of $M_z$, at least one
incident vertex became 2-monochromatic or 3-monochromatic, a contradiction
to the existence of $u$ and $v$. Hence, all properties of the lemma hold.


Step 3: Labelling the edges between $V_1$ and $V_2$


From now on, we will modify a 3-labelling $\ell$ of $G$ obtained by applying Lemma 2.
We denote by $\mathcal{H}$ the set of the connected components of $G[V_1 \cup V_2]$ that
contain two adjacent vertices $u \in V_1$ and $v \in V_2$ having the same product by
$\ell$. By Items 1 and 2 of Lemma 2, such $u$ and $v$ are 1-monochromatic. Also, by
Item 6 of Lemma 2, recall that every connected component of $\mathcal{H}$ has at least
two edges. In what follows, we only relabel edges of some connected components
$H \in \mathcal{H}$, making sure that their vertices (in $V_1 \cup V_2$) are monochromatic
or special. This ensures that only vertices of $H$ have their product affected, and
thus that no new conflicts involving vertices in $V_3 \cup \dots \cup V_t$ are created.

For a subgraph $X$ of $H \in \mathcal{H}$ (possibly $X = H$), if, after having relabelled
edges of $X$, no conflict remains between vertices of $X$ and all vertices of $X$ are
either monochromatic or special, then we say that $X$ satisfies Property (P3).


Lemma 3. If we can relabel the edges of every $H \in \mathcal{H}$ so that every $H$
satisfies Property (P3), then the resulting 3-labelling is p-proper.


Proof. This is because if we get rid of all conflicts in $\mathcal{H}$, then the only
possible remaining conflicts are between vertices in $V_1 \cup V_2$ and in
$V_3 \cup \dots \cup V_t$. In particular, recall that any two vertices of two distinct
connected components $H_1, H_2$ of $G[V_1 \cup V_2]$ cannot be adjacent. Note also
that, because we only relabelled edges in $\mathcal{H}$, the vertices in
$V_3 \cup \dots \cup V_t$ retain the product types described in Lemma 2. In particular,
they remain bichromatic and none of them is special. Thus, they cannot be in
conflict with the vertices in $V_1 \cup V_2$.


In order to show that we can relabel the edges of every $H \in \mathcal{H}$ so that it
fulfils Property (P3), the following result will be particularly handy.


Lemma 4. For every integer $s \in \{2, 3\}$, every connected bipartite graph $H$
whose edges are labelled 1 or $s$, and any vertex $v$ in any part $V_i \in \{V_1, V_2\}$
of $H$, we can relabel the edges of $H$ with 1 and $s$ so that $d_s(u)$ is odd (even,
resp.) for every $u \in V_i \setminus \{v\}$, and $d_s(u)$ is even (odd, resp.) for every
$u \in V_{3-i}$.


Proof. As long as $H$ has a vertex $u$ different from $v$ that does not satisfy the
desired condition, apply the following. Choose any path $P$ from $u$ to $v$, which
exists by the connectedness of $H$. Now follow $P$ from $u$ to $v$, and change the
labels of the traversed edges from 1 to $s$ and vice versa. It can be noted that
this alters the parity of the $s$-degrees of $u$ and $v$, while it does not alter that
parity for any of the other vertices of $H$. Thus, this makes $u$ satisfy the desired
condition, while the situation does not change for the other vertices different from
$u$ and $v$. Thus, once this process ends, all vertices of $H$ different from $v$ have
their $s$-degree as desired in the resulting labelling.
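
The proof of Lemma 4 is constructive, and the path-flipping procedure can be
sketched in a few lines of Python. This is only an illustration under assumed
conventions that are not part of the paper: the graph is an adjacency dictionary
`adj`, labels are stored per edge as `frozenset` keys, `want_odd[u]` encodes the
target parity of the $s$-degree of $u$, and a breadth-first search supplies the
path $P$ (any path would do).

```python
from collections import deque


def fix_parities(adj, labels, s, v, want_odd):
    """Relabel edges with 1 and s so every vertex u != v gets the wanted parity.

    adj:      dict vertex -> list of neighbours (graph assumed connected)
    labels:   dict frozenset({a, b}) -> 1 or s (modified in place)
    want_odd: dict vertex -> True if the s-degree of that vertex must be odd
              (the entry for v, if any, is ignored)
    """
    def s_degree(u):
        return sum(1 for w in adj[u] if labels[frozenset((u, w))] == s)

    def path_to_v(u):
        # Breadth-first search from u, then walk the parent links back from v.
        parent = {u: None}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            if x == v:
                break
            for y in adj[x]:
                if y not in parent:
                    parent[y] = x
                    queue.append(y)
        path = [v]
        while path[-1] != u:
            path.append(parent[path[-1]])
        return path[::-1]

    for u in adj:
        if u == v or (s_degree(u) % 2 == 1) == want_odd[u]:
            continue
        # Flipping a whole u-v path changes the parity at u and v only:
        # every inner vertex of the path has exactly two flipped edges.
        path = path_to_v(u)
        for a, b in zip(path, path[1:]):
            e = frozenset((a, b))
            labels[e] = s if labels[e] == 1 else 1
    return labels


# Path a - b - c with parts {a, c} and {b}, v = a, s = 2, all labels equal to 1.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {frozenset(("a", "b")): 1, frozenset(("b", "c")): 1}
relabelled = fix_parities(adj, labels, 2, "a", {"a": False, "b": False, "c": True})
```

A single pass over the vertices suffices here because fixing one vertex never
disturbs the parity of any vertex other than $v$.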


We are now ready to treat the connected components $H \in \mathcal{H}$ independently,
so that they all meet Property (P3). To ease the reading, we distinguish several
cases depending on the types and on the degrees of the vertices that $H$ includes.
In each of the successive cases we consider, it is implicitly assumed that $H$ does
not meet the conditions of any previous case.


Claim 1. If $H \in \mathcal{H}$ has a 3-monochromatic vertex $v \in V_1$, or a
1-monochromatic vertex $v_1 \in V_1$ having two 1-monochromatic neighbours
$u_1, u_2 \in V_2$ with degree 1 (in $H$), then we can relabel edges of $H$ so that
$H$ satisfies Property (P3).


Proof. Recall that all edges of $H$ are assigned label 1; thus, if a vertex of $H$ is
3-monochromatic, then it must be due to incident downward edges to $V_3, \dots, V_t$.

If $H$ has a 1-monochromatic vertex $v_1 \in V_1$ that is adjacent to two degree-1
1-monochromatic vertices $u_1, u_2 \in V_2$, then we set
$\ell(v_1u_1) = \ell(v_1u_2) = 3$. Note that $u_1$ and $u_2$ become 3-monochromatic
with 3-degree 1, and are thus no longer in conflict with $v_1$, as it becomes
3-monochromatic with 3-degree 2. Note that either we got rid of all conflicts in $H$
and $H$ now satisfies Property (P3) as desired, or conflicts between other
1-monochromatic vertices of $H$ remain. In the latter case, we continue with the
following arguments.

Assume $H$ has remaining conflicts, and that $H$ has a 3-monochromatic vertex
$v \in V_1$ (and, due to the previous process, perhaps 3-monochromatic vertices $u_1$
and $u_2$ in $V_2$, in which case their 3-degree (and degree in $H$) is precisely 1,
while their unique neighbour $v$ in $V_1 \cap V(H)$ is 3-monochromatic with
3-degree 2). Let $X$ be the set of all 3-monochromatic vertices of $H$ belonging to
$V_1$. Let $C_1, \dots, C_q$ denote the $q \geq 1$ connected components of $H - X$ that
do not consist of a 3-monochromatic vertex of $V_2$ (the vertices $u_1$ and $u_2$ we
dealt with earlier on). For every $C_i$, we choose arbitrarily a vertex $x_i \in X$ and
a vertex $y_i \in C_i$ such that $x_i$ and $y_i$ are adjacent in $H$. Note that the
vertices of $C_i$ are either 1-monochromatic or 2-monochromatic (in which case they
belong to $V_2$), since all 3-monochromatic vertices of $H$ are part of $X$ (or are
the vertices $u_1$ and $u_2$ dealt with earlier on, which we have omitted and are not
part of the $C_i$'s).

By Lemma 4, in every $C_i$ we can relabel the edges with 1 and 2 so that all
vertices in $(V_2 \cap V(C_i)) \setminus \{y_i\}$ are 2-monochromatic with odd 2-degree,
while all vertices in $V_1 \cap V(C_i)$ are 2-monochromatic with even 2-degree or
possibly 1-monochromatic if their even 2-degree is 0. In particular, recall that
$y_i$ must be 1-monochromatic or 2-monochromatic. If $y_i$ has odd 2-degree, then
there are no conflicts between vertices of $C_i$. If $y_i$ has even non-zero
2-degree, then we set $\ell(x_iy_i) = 3$, thereby making $y_i$ special.

Let $Y$ be the set of all 1-monochromatic $y_i$'s having a 1-monochromatic
neighbour $w_i$ in $C_i$. Let $H'$ be the subgraph of $H$ induced by $Y \cup X$. Note
that every edge of $H'$ is labelled 1. Let now $Q_1, \dots, Q_p$ denote the connected
components of $H'$ and choose $x_k \in X \cap V(Q_k)$ for every $k \in \{1, \dots, p\}$.
For every $k$, we apply Lemma 4 with labels 1 and 3 so that all vertices in
$V_2 \cap V(Q_k)$ get 3-monochromatic with odd 3-degree, while all vertices in
$(V_1 \cap V(Q_k)) \setminus \{x_k\}$ get 3-monochromatic with even 3-degree or possibly
1-monochromatic (3-degree 0).

If $x_k$ is involved in a conflict with a vertex $y_i \in V_2 \cap V(Q_k)$, then this
is because $x_k$ has odd 3-degree. Then:


– If $\ell(x_ky_i) = 3$, then $d_3(y_i) = d_3(x_k) \geq 3$ since $x_k \in X$ ($x_k$ must
  thus be incident to at least one other edge labelled 3, either a downward edge to
  $V_3, \dots, V_t$ or an edge incident to $u_1$ (and similarly an edge incident to
  $u_2$)). We here assign label 1 to the edge $x_ky_i$ and label 3 to the edge
  $y_iw_i$. This way, $x_k$ gets even 3-degree while the 3-degree of $y_i$ does not
  change. Note that $y_i$ and $w_i$ are not in conflict since $d_3(w_i) = 1$ and
  $d_3(y_i) \geq 3$.

– Otherwise, if $\ell(x_ky_i) = 1$, then we assign label 3 to the edge $x_ky_i$ and
  label 3 to the edge $y_iw_i$. This way, $x_k$ gets even 3-degree while the 3-degree
  of $y_i$ remains odd and must be at least 3. Again $y_i$ and $w_i$ are not in
  conflict since $d_3(w_i) = 1$ and $d_3(y_i) \geq 3$.


We claim that we got rid of all conflicts in $H$. Indeed, consider two adjacent
vertices $a \in V_1 \cap V(H)$ and $b \in V_2 \cap V(H)$. Suppose first that $a$ and
$b$ belong to some $C_i$. Note that, with the exception of $y_i$ and maybe of the
vertex $w_i$ (if it exists and $y_i \in Y$), every vertex of $C_i$ is 1-monochromatic
or 2-monochromatic, the vertices of $V_1 \cap V(C_i)$ having even 2-degree and the
vertices of $V_2 \cap V(C_i)$ having odd 2-degree. Thus, no conflict involves two of
these vertices. Suppose now that $b = y_i$. If $y_i$ is 2-monochromatic with odd
2-degree, then there is no conflict involving $y_i$ in $C_i$ since all of its
neighbours in $C_i$ have even 2-degree. If $y_i$ is special, then it is the only
special vertex of $C_i$, so, here again, it cannot be involved in a conflict. If
$y_i \notin Y$ and $y_i$ is 1-monochromatic, then $y_i$ has no other 1-monochromatic
neighbour in $C_i$ by definition of $Y$. If $y_i \in Y$, then $y_i$ is
3-monochromatic with odd 3-degree, the only other possible 3-monochromatic
neighbour of $y_i$ in $C_i$ being $w_i$, but we showed previously that their
3-degrees differ. Thus, in all cases, there cannot be conflicts between vertices
of $C_i$.

We are left with the case where $a$ and $b$ do not belong to the same $C_i$. In
particular, this implies that $a \in X$ and that $a$ is 3-monochromatic. The only
possible 3-monochromatic vertices in $V_2$ are the vertices of $Y$, which have odd
3-degree, and the 3-monochromatic vertices $u_1$ and $u_2$ with 3-degree 1 and
degree 1 in $H$ which might have been created at the very beginning of the proof.
If $b \in Y$, then, due to the application of Lemma 4 above, the only vertex of $X$
which can have odd 3-degree is some $x_k$, but for this vertex we either ensured
that it was involved in no conflict, or we tweaked the labelling so that it got
even 3-degree without modifying the labelling properties obtained through Lemma 4.
If $b$ is $u_1$ or $u_2$, then $b$ has only one neighbour $v$. Note that the edges
$vu_1$ and $vu_2$ are still labelled 3 as they are not part of the $Q_k$'s, and,
thus, $d_3(b) = 1$ and $d_3(v) \geq 2$. Hence, there is no conflict between vertices
of $X$ and other vertices of $H$. This implies that $H$ satisfies Property (P3).


We can thus assume that $H$ does not meet any of the conditions in Claim 1.
The next step is showing that we can treat $H$ in a similar way, in case $H$ contains
a 1-monochromatic vertex $u \in V_2$ with at least two neighbours in $H$. This can
be proved similarly to Claim 1, by investigating the structure of $H$ and making
use of Lemma 4 to relabel edges of $H$ in such a way that all remaining conflicts
are located in very precise places of $H$ (so that we can then handle them one
by one). The formal proof being long, tedious, and in the same vein as that
of Claim 1, we omit it from this paper due to space limitations. The interested
reader will find the whole proof in [3], the full version of the current paper.


Claim 2. If $H$ has a 1-monochromatic vertex $u \in V_2$ with at least two neighbours
in $H$, then we can relabel edges of $H$ so that $H$ satisfies Property (P3).


Assuming $H$ does not meet any of the conditions in Claims 1 and 2, a final
argument allows us to relabel edges of $H$ to get rid of all its conflicts.


Claim 3. We can relabel edges of H so that it satisfies Property (P3). 


Proof. Let $v \in V_1$ and $u \in V_2$ be two adjacent 1-monochromatic vertices of $H$
(which must exist as otherwise $H$ would satisfy Property (P3)). Because $H$ has
at least two edges (as otherwise it would belong to $M$, not to $\mathcal{H}$), at
least one of $v$ and $u$ must have another neighbour in $H$. Since Claim 2 does not
apply, note that $u$ must have degree 1 in $H$ (since all neighbours of $u$ in $H$
must be 1-monochromatic due to Claim 1 not applying). So $v$ is also adjacent to
$k \geq 1$ vertices $x_1, \dots, x_k \in V_2$ different from $u$, which must all be
2-monochromatic (because of incident downward edges to $V_3, \dots, V_t$; recall that
all edges of $H$ are labelled 1) as otherwise Claim 2 would apply.

Set $H' = H - u$. According to Lemma 4, we can relabel edges in $H'$ with 1
and 2 so that all vertices in $(V_1 \cap V(H')) \setminus \{v\}$ have odd 2-degree,
while all vertices in $V_2 \cap V(H')$ have even 2-degree. Recall that $u$ is
1-monochromatic. Thus, if also $v$ is 2-monochromatic with odd 2-degree, then we are
done. Assume thus that $v$ is 2-monochromatic with even 2-degree.


– Assume first that the 2-degree of $v$ is even and at least 2. In that case, set
  $\ell(vu) = 3$. This way, $u$ becomes 3-monochromatic, while $v$ becomes special.

– Assume now $v$ is 1-monochromatic. This implies that $\ell(vx_1) = 1$. Change
  $\ell(vx_1)$ to 3. This way, $x_1$ becomes special (recall its 2-degree is even and
  at least 1, due to incident downward edges), while $v$ becomes 3-monochromatic.
  Note that $u$ remains 1-monochromatic.


In both cases, it can be checked that $H$ now fulfils Property (P3).


At this point, we dealt with all connected components of $\mathcal{H}$, and the
resulting labelling $\ell$ of $G$ is p-proper by Lemma 3. The whole proof is thus
complete.


Abstract. The class of $2K_2$-free graphs has been well studied in various
contexts in the past. It is known that the class of $\{2K_2, 2K_1 + K_p\}$-free
graphs and $\{2K_2, (K_1 \cup K_2) + K_p\}$-free graphs admits a linear $\chi$-binding
function. In this paper, we study the class of $(P_3 \cup P_2)$-free graphs, which
is a superclass of $2K_2$-free graphs. We show that $\{P_3 \cup P_2, 2K_1 + K_p\}$-free
graphs and $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs also admit a
linear $\chi$-binding function. In addition, we give tight chromatic bounds
for $\{P_3 \cup P_2, HVN\}$-free graphs and $\{P_3 \cup P_2, \text{diamond}\}$-free
graphs, and it can be seen that the latter is an improvement of the existing bound
given by A. P. Bharathi and S. A. Choudum [1].


Keywords: Chromatic number · $\chi$-binding function · $(P_3 \cup P_2)$-free
graphs · Perfect graphs


1 Introduction


All graphs considered in this paper are simple, finite and undirected. Let $G$ be
a graph with vertex set $V(G)$ and edge set $E(G)$. For any positive integer $k$, a
proper $k$-coloring of a graph $G$ is a mapping $c : V(G) \to \{1, 2, \dots, k\}$ such
that adjacent vertices receive distinct colors. If a graph $G$ admits a proper
$k$-coloring, then $G$ is said to be $k$-colorable. The chromatic number, $\chi(G)$,
of a graph $G$ is the smallest $k$ such that $G$ is $k$-colorable. Let $P_n$, $C_n$
and $K_n$ respectively denote the path, the cycle and the complete graph on $n$
vertices. For $S, T \subseteq V(G)$, let $N_T(S) = N(S) \cap T$ (where $N(S)$ denotes
the set of all neighbors of $S$ in $G$), let $\langle S \rangle$ denote the subgraph
induced by $S$ in $G$ and let $[S, T]$ denote the set of all edges with one end in
$S$ and the other end in $T$. If every vertex in $S$ is adjacent with every vertex
in $T$, then $[S, T]$ is said to be complete. For any graph $G$, let $\overline{G}$
denote the complement of $G$.

Let $\mathcal{F}$ be a family of graphs. We say that $G$ is $\mathcal{F}$-free if it
does not contain any induced subgraph which is isomorphic to a graph in
$\mathcal{F}$. For a fixed graph $H$, let us denote the family of $H$-free graphs by
$\mathcal{G}(H)$. For any two disjoint graphs $G_1$ and $G_2$ let $G_1 \cup G_2$ and
$G_1 + G_2$ denote the union and the join of $G_1$ and $G_2$ respectively. Let
$\omega(G)$ and $\alpha(G)$ denote the clique number and independence number of a
graph $G$ respectively. When there is no ambiguity, $\omega(G)$ will be denoted by
$\omega$. A graph $G$ is said to be perfect if $\chi(H) = \omega(H)$, for every
induced subgraph $H$ of $G$.
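
To make the definitions of $\chi(G)$ and $\omega(G)$ concrete, the following
brute-force Python sketch computes both for a small graph. It is an illustration
only (exponential time, hypothetical function names, graphs encoded as adjacency
dictionaries of sets) and plays no role in the proofs.

```python
from itertools import combinations, product


def chromatic_number(adj):
    """Smallest k such that the graph admits a proper k-coloring."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for colours in product(range(1, k + 1), repeat=len(vertices)):
            c = dict(zip(vertices, colours))
            # Proper: the two ends of every edge receive distinct colors.
            if all(c[u] != c[v] for u in adj for v in adj[u]):
                return k
    return 0  # graph without vertices


def clique_number(adj):
    """Size of a largest set of pairwise adjacent vertices."""
    vertices = list(adj)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(b in adj[a] for a, b in combinations(subset, 2)):
                return size
    return 0


# The cycle C_5 on vertices 0..4.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
chi_c5 = chromatic_number(c5)
omega_c5 = clique_number(c5)
```

For $C_5$ the two values differ, so $C_5$ is not perfect.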

In order to determine an upper bound for the chromatic number of a graph in
terms of its clique number, the concept of $\chi$-binding functions was introduced
by A. Gyárfás in [4]. A class $\mathcal{G}$ of graphs is said to be $\chi$-bounded [4]
if there is a function $f$ (called a $\chi$-binding function) such that
$\chi(G) \leq f(\omega(G))$, for every $G \in \mathcal{G}$. We say that the
$\chi$-binding function $f$ is special linear if $f(x) = x + c$, where $c$ is a
constant.

The family of $2K_2$-free graphs has been well studied. A. Gyárfás in [4] posed
a problem which asks for the order of magnitude of the smallest $\chi$-binding
function for $\mathcal{G}(2K_2)$. In this direction, T. Karthick et al. in [5] proved
that the families of $\{2K_2, H\}$-free graphs, where
$H \in \{HVN, \text{diamond}, K_1 + P_4, K_1 + C_4, \overline{P_5}, \overline{P_2 \cup P_3}, K_5 - e\}$,
admit special linear $\chi$-binding functions. The bounds for $\{2K_2, K_5 - e\}$-free
graphs and $\{2K_2, K_1 + C_4\}$-free graphs were later improved by Athmakoori
Prashant et al., in [7,8]. In [2], C. Brause et al., improved the $\chi$-binding
function for $\{2K_2, K_1 + P_4\}$-free graphs to $\max\{3, \omega(G)\}$. Also they
proved that for $s \neq 1$ or $\omega(G) \neq 2$, the class of
$\{2K_2, (K_1 \cup K_2) + K_s\}$-free graphs with $\omega(G) \geq 2s$ is perfect and
for $r \geq 1$, the class of $\{2K_2, 2K_1 + K_r\}$-free graphs with
$\omega(G) \geq 2r$ is perfect. Clearly when $s = 2$ and $r = 3$,
$(K_1 \cup K_2) + K_s = HVN$ and $2K_1 + K_r = K_5 - e$, which implies that the
classes of $\{2K_2, HVN\}$-free graphs and $\{2K_2, K_5 - e\}$-free graphs are perfect
for $\omega(G) \geq 4$ and $\omega(G) \geq 6$ respectively, which improved the bounds
given in [5].

Motivated by C. Brause et al., and their work on $2K_2$-free graphs in [2],
we started looking at $(P_3 \cup P_2)$-free graphs, which is a superclass of
$2K_2$-free graphs. In [1], A. P. Bharathi et al., obtained an $O(\omega^3)$ upper
bound for the chromatic number of $(P_3 \cup P_2)$-free graphs and obtained sharper
bounds for $\{P_3 \cup P_2, \text{diamond}\}$-free graphs. In this paper, we obtain
linear $\chi$-binding functions for the class of
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs and
$\{P_3 \cup P_2, 2K_1 + K_p\}$-free graphs. In addition, for $\omega(G) \geq 3p - 1$,
we show that the class of $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs admits
a special linear $\chi$-binding function $f(x) = \omega(G) + p - 1$ and the class of
$\{P_3 \cup P_2, 2K_1 + K_p\}$-free graphs are perfect. In addition, we give a tight
$\chi$-binding function for $\{P_3 \cup P_2, HVN\}$-free graphs and
$\{P_3 \cup P_2, \text{diamond}\}$-free graphs. This bound for
$\{P_3 \cup P_2, \text{diamond}\}$-free graphs turns out to be an improvement of the
existing bound obtained by A. P. Bharathi et al., in [1].

Some graphs that are considered as forbidden induced subgraphs in this
paper are given in Fig. 1. Notations and terminologies not mentioned here are
as in [10].


[Figure: drawings of the paw, the diamond and the HVN]


Fig. 1. Some special graphs


2 Preliminaries 


Throughout this paper, we use a particular partition of the vertex set of a
graph $G$ as defined initially by S. Wagon in [9] and later improved by A.
P. Bharathi et al., in [1] as follows. Let $A = \{v_1, v_2, \dots, v_\omega\}$ be a
maximum clique of $G$. Let us define the lexicographic ordering on the set
$L = \{(i,j) : 1 \leq i < j \leq \omega\}$ in the following way. For two distinct
elements $(i_1, j_1), (i_2, j_2) \in L$, we say that $(i_1, j_1)$ precedes
$(i_2, j_2)$, denoted by $(i_1, j_1) <_L (i_2, j_2)$, if either $i_1 < i_2$ or
$i_1 = i_2$ and $j_1 < j_2$. For every $(i,j) \in L$, let

\[
C_{i,j} = \{v \in V(G) \setminus A : v \notin N(v_i) \cup N(v_j)\} \setminus
\Big\{ \bigcup_{(i',j') <_L (i,j)} C_{i',j'} \Big\}.
\]

Note that, for any $k \in \{1, 2, \dots, j\} \setminus \{i, j\}$, $[v_k, C_{i,j}]$ is
complete. Hence $\omega(\langle C_{i,j} \rangle) \leq \omega(G) - j + 2$.

For $1 \leq k \leq \omega$, let us define
$I_k = \{v \in V(G) \setminus A : v \in N(v_i), \text{ for every } i \in \{1, 2, \dots, \omega\} \setminus \{k\}\}$.
Since $A$ is a maximum clique, for $1 \leq k \leq \omega$, $I_k$ is an independent
set and for any $x \in I_k$, $xv_k \notin E(G)$. Clearly, each vertex in
$V(G) \setminus A$ is non-adjacent to at least one vertex in $A$. Hence those
vertices will be contained either in $I_k$ for some $k \in \{1, 2, \dots, \omega\}$,
or in $C_{i,j}$ for some $(i,j) \in L$. Thus

\[
V(G) = A \cup \Big( \bigcup_{k=1}^{\omega} I_k \Big) \cup \Big( \bigcup_{(i,j) \in L} C_{i,j} \Big).
\]

Sometimes, we use the partition $V(G) = V_1 \cup V_2$, where

\[
V_1 = \bigcup_{1 \leq k \leq \omega} (\{v_k\} \cup I_k) = \bigcup_{1 \leq k \leq \omega} U_k
\quad \text{and} \quad
V_2 = \bigcup_{(i,j) \in L} C_{i,j}.
\]
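
The partition above can be computed directly from the definition: a vertex outside
$A$ with exactly one non-neighbour $v_k$ in $A$ falls in $I_k$, and otherwise it
falls in $C_{i,j}$, where $i < j$ are the two smallest indices of its non-neighbours
in $A$ (this is the lexicographically first pair). The Python sketch below assumes
a hypothetical encoding (adjacency dictionary of sets, the clique given as an
ordered list) and does not itself check that the supplied clique is maximum.

```python
def wagon_partition(adj, clique):
    """Return (I, C): I[k] is the set I_k, C[(i, j)] is the set C_{i,j}.

    adj:    dict vertex -> set of neighbours
    clique: list [v_1, ..., v_omega], assumed to be a maximum clique
    """
    omega = len(clique)
    in_clique = set(clique)
    I = {k: set() for k in range(1, omega + 1)}
    C = {}
    for x in adj:
        if x in in_clique:
            continue
        # 1-based indices of the clique vertices that x is not adjacent to.
        missing = [k for k, vk in enumerate(clique, start=1) if vk not in adj[x]]
        if not missing:
            raise ValueError("clique is not maximum: it can be extended")
        if len(missing) == 1:
            I[missing[0]].add(x)
        else:
            C.setdefault((missing[0], missing[1]), set()).add(x)
    return I, C


# Triangle a, b, c with three further vertices d, e, f.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
    "f": set(),
}
I, C = wagon_partition(adj, ["a", "b", "c"])
```

In this example $d$ misses only $v_3 = c$ and lands in $I_3$, while $e$ and $f$ both
miss $v_1$ and $v_2$ and land in $C_{1,2}$.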
<k<w 1<k<w (i,jyeL 


Let us recall a result on $(P_3 \cup P_2)$-free graphs given by A. P. Bharathi et al.,
in [1].


Theorem 1 [1]. If a graph $G$ is $(P_3 \cup P_2)$-free, then
$\chi(G) \leq \frac{\omega(G)(\omega(G)+1)(\omega(G)+2)}{6}$.


Without much difficulty one can make the following observations on
$(P_3 \cup P_2)$-free graphs.


Fact 2. Let $G$ be a $(P_3 \cup P_2)$-free graph. For $(i,j) \in L$, the following
holds.


(i) Each $C_{i,j}$ is a disjoint union of cliques, that is, $\langle C_{i,j} \rangle$ is
$P_3$-free.
(ii) For every integer $s \in \{1, 2, \dots, j\} \setminus \{i, j\}$,
$N(v_s) \supseteq \{C_{i,j} \cup A \cup (\bigcup_{k=1}^{\omega} I_k)\} \setminus \{v_s \cup I_s\}$.


3 $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs


Let us start Sect. 3 with some observations on $((K_1 \cup K_2) + K_p)$-free graphs.


Proposition 1. Let $G$ be a $((K_1 \cup K_2) + K_p)$-free graph with
$\omega(G) \geq p + 2$, $p \geq 1$. Then $G$ satisfies the following.


(i) For $k, \ell \in \{1, 2, \dots, \omega(G)\}$, $[I_k, I_\ell]$ is complete. Thus,
$\langle V_1 \rangle$ is a complete multipartite graph with
$U_k = \{v_k\} \cup I_k$, $1 \leq k \leq \omega(G)$, as its partitions.
(ii) For $j \geq p + 2$ and $1 \leq i < j$, $C_{i,j} = \emptyset$.
(iii) For $x \in V_2$, $x$ has neighbors in at most $(p - 1)$ $U_\ell$'s where
$\ell \in \{1, 2, \dots, \omega(G)\}$.


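Proposition 2. Let $G$ be a $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph with
$\omega(G) \geq p + 2$, $p \geq 1$. Then $G$ satisfies the following.


(i) For $(i,j) \in L$ such that $j \leq p + 1$, if
$\omega(\langle C_{i,j} \rangle) \geq p - j + 4$, then
$\omega(\langle C_{k,j} \rangle) \leq 1$ for $k \neq i$ and $1 \leq k \leq j - 1$.
(ii) If $\omega(G) = (p + 2 + k)$, $k \geq 0$, then
$\Big\langle \bigcup_{j = \max\{2,\, p+1-\lfloor k/2 \rfloor\}}^{p+1} \Big( \bigcup_{i=1}^{j-1} C_{i,j} \Big) \Big\rangle$
is $P_3$-free.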


As a consequence of Proposition 1, we obtain Corollary 1 which is a result
due to S. Olariu in [6].


Corollary 1 [6]. Let $G$ be a connected graph. Then $G$ is a paw-free graph if and
only if $G$ is either $K_3$-free or complete multipartite.


Now, for $p \geq 1$ and $\omega \geq \max\{3, 3p - 1\}$, let us determine the structural
characterization and the chromatic number of
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs.


Theorem 3. Let $p$ be a positive integer and $G$ be a
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph with $V(G) = V_1 \cup V_2$. If
$\omega(G) \geq \max\{3, 3p - 1\}$, then (i) $\langle V_1 \rangle$ is a complete
multipartite graph with partition $U_1, U_2, \dots, U_\omega$, (ii)
$\langle V_2 \rangle$ is a $P_3$-free graph and (iii) $\chi(G) \leq \omega(G) + p - 1$.


Proof. Let $p \geq 1$ and $G$ be a $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph
with $\omega(G) \geq \max\{3, 3p - 1\}$. By (i) of Proposition 1, we see that
$\langle V_1 \rangle$ is a complete multipartite graph with the partition
$U_k = \{v_k\} \cup I_k$, $1 \leq k \leq \omega$. By (i) of Fact 2, each
$\langle C_{i,j} \rangle$ is $P_3$-free, for every $(i,j) \in L$. Also without much
difficulty we can show that $\langle V_2 \rangle$ is $P_3$-free. Now, let us exhibit
an $(\omega + p - 1)$-coloring for $G$ using the colors
$\{1, 2, \dots, \omega + p - 1\}$. For $1 \leq k \leq \omega$, give the color $k$ to
the vertices of $U_k$. Let $H$ be a component in $\langle V_2 \rangle$. Clearly, each
vertex in $H$ is adjacent to at most $p - 1$ colors given to the vertices of $V_1$
and is adjacent to $\omega(H) - 1$ vertices of $H$. Since $\omega(H) \leq \omega(G)$,
each vertex in $H$ is adjacent to at most $\omega(G) + p - 2$ colors. Hence, there
is a color available for each vertex in $H$. Similarly, all the components of
$\langle V_2 \rangle$ can be colored properly. Hence, $\chi(G) \leq \omega(G) + p - 1$.

Even though we are not able to show that the bound given in Theorem 3 is
tight, we can observe that the upper bound cannot be made smaller than
$\omega + \lceil \frac{p-1}{2} \rceil$ by providing the following example. For
$p \geq 1$, consider the graph $G^*$ with


V(G*) =X UY U Z, where X = o x;,Y = o yi and Z = ‘ z if p > 2 else 
Z = (and edge set E(G*) = {{2; toy: oe cme ice omens 


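where $1 \leq i, j, m, n \leq \omega$, $1 \leq r, s \leq p - 1$, $i \neq j$, $r \neq s$
and $m \neq n$\}. Without much difficulty, one can observe that $G^*$ is a
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph, $\omega(G^*) = \omega$ and
$\alpha(G^*) = 2$. Hence,
$\chi(G^*) \geq \lceil \frac{|V(G^*)|}{2} \rceil = \omega(G^*) + \lceil \frac{p-1}{2} \rceil$.

Next, we obtain a linear $\chi$-binding function for
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs with $\omega \geq 3$.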


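Theorem 4. Let $p$ be an integer greater than 1. If $G$ is a
$\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph, then

\[
\chi(G) \leq
\begin{cases}
\omega(G) + \sum_{j=2}^{p+1} (j-1)(p-j+3) & \text{for } 3 \leq \omega(G) \leq p+1 \\
\omega(G) + 7(p-1) + \sum_{j=4}^{p - \lfloor k/2 \rfloor} (j-1)(p-j+3) & \text{for } \omega(G) = (p+2+k),\ 0 \leq k \leq 2p-5 \\
\omega(G) + 4p - 3 & \text{for } \omega(G) = 3p - 2 \\
\omega(G) + p - 1 & \text{for } \omega(G) \geq 3p - 1.
\end{cases}
\]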


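Proof. Let $G$ be a $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph with $p \geq 2$.
For $\omega(G) \geq 3p - 1$, the bound follows from Theorem 3. By (ii) of
Proposition 1, we see that $C_{i,j} = \emptyset$ for all $j \geq p + 2$. We know that
$V(G) = V_1 \cup V_2$, where $V_1 = \bigcup_{1 \leq k \leq \omega} (\{v_k\} \cup I_k)$
and $V_2 = \bigcup_{(i,j) \in L} C_{i,j}$. Clearly, the vertices of $V_1$ can be colored
with $\omega(G)$ colors. Let us find an upper bound for $\chi(\langle V_2 \rangle)$.
First, let us consider the case when $3 \leq \omega(G) \leq p + 1$. For
$1 \leq i < j \leq p + 1$, one can observe that
$\omega(\langle C_{i,j} \rangle) \leq \omega(G) - j + 2 \leq p - j + 3$. Thus one can
properly color the vertices of $\langle \bigcup_{i=1}^{j-1} C_{i,j} \rangle$ with at
most $(j - 1)(p - j + 3)$ colors and hence

\[
\chi(G) \leq \chi(\langle V_1 \rangle) + \chi(\langle V_2 \rangle)
\leq \omega(G) + \sum_{j=2}^{p+1} (j-1)(p-j+3).
\]

Next, let us consider $\omega(G) = (p + 2 + k)$, where $0 \leq k \leq 2p - 4$. By
using (ii) of Proposition 2,
$\Big\langle \bigcup_{j = p - \lfloor k/2 \rfloor + 1}^{p+1} \Big( \bigcup_{i=1}^{j-1} C_{i,j} \Big) \Big\rangle$
is a $P_3$-free graph. As in Theorem 3, one can color the vertices of
$V(G) \setminus \Big\{ \bigcup_{j=2}^{p - \lfloor k/2 \rfloor} \Big( \bigcup_{i=1}^{j-1} C_{i,j} \Big) \Big\}$
with at most $\omega(G) + p - 1$ colors. For $k = 2p - 4$, $\omega(G) = 3p - 2$ and
we see that $\langle V_2 \setminus C_{1,2} \rangle$ is $P_3$-free and
$\chi(\langle V(G) \setminus C_{1,2} \rangle) \leq \omega(G) + p - 1$. Therefore,
$\chi(G) \leq \chi(\langle V(G) \setminus C_{1,2} \rangle) + \chi(\langle C_{1,2} \rangle)
\leq (\omega(G) + p - 1) + (3p - 2) = \omega(G) + 4p - 3$.

Finally, for $0 \leq k \leq 2p - 5$, by using (i) of Proposition 2 and by using
similar strategies but with a little more involvement, we can show that
$\langle \bigcup_{i=1}^{j-1} C_{i,j} \rangle$, where
$2 \leq j \leq p - \lfloor k/2 \rfloor$, can be properly colored using $\omega(G)$
colors when $\omega(\langle C_{i,j} \rangle) \geq p - j + 4$, for some
$i \in \{1, 2, \dots, j - 1\}$, or by using $(j - 1)(p - j + 3)$ colors when
$\omega(\langle C_{i,j} \rangle) \leq p - j + 3$, for every
$i \in \{1, 2, \dots, j - 1\}$. For $4 \leq j \leq p - \lfloor k/2 \rfloor$, one can
observe that $(j - 1)(p - j + 3) \geq \omega(G)$ and hence the vertices of
$\bigcup_{j=4}^{p - \lfloor k/2 \rfloor} \Big( \bigcup_{i=1}^{j-1} C_{i,j} \Big)$ can be
properly colored with $\sum_{j=4}^{p - \lfloor k/2 \rfloor} (j-1)(p-j+3)$ colors. When
$j = 3$, $(j - 1)(p - j + 3) = 2p$. Since $2p - 5 \geq 0$, $p \geq 3$. Also, since
$\omega(G) \geq 4$, the vertices of $C_{1,2}$ and $(C_{1,3} \cup C_{2,3})$ can be
properly colored with at most $(3p - 3)$ colors each. Hence,

\[
\chi(G) \leq (\omega(G) + p - 1) + 2(3p - 3) + \sum_{j=4}^{p - \lfloor k/2 \rfloor} (j-1)(p-j+3)
= \omega(G) + 7(p-1) + \sum_{j=4}^{p - \lfloor k/2 \rfloor} (j-1)(p-j+3).
\]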
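
To get a feel for how the four regimes of Theorem 4 compare numerically, the bound
can be evaluated with a small Python function. This is a sketch only: the function
name is hypothetical, and the summation ranges ($j$ from 2 to $p+1$ in the first
regime, $j$ from 4 to $p - \lfloor k/2 \rfloor$ in the second) are taken as they
appear in the proof above.

```python
def theorem4_bound(omega, p):
    """Upper bound on chi(G) as read from Theorem 4 (p >= 2, omega(G) >= 3)."""
    if p < 2 or omega < 3:
        raise ValueError("Theorem 4 is stated for p >= 2 and omega(G) >= 3")

    def term(j):
        return (j - 1) * (p - j + 3)

    if omega >= 3 * p - 1:
        return omega + p - 1
    if omega == 3 * p - 2:
        return omega + 4 * p - 3
    if omega <= p + 1:
        return omega + sum(term(j) for j in range(2, p + 2))
    # Remaining case: omega = p + 2 + k with 0 <= k <= 2p - 5.
    k = omega - (p + 2)
    return omega + 7 * (p - 1) + sum(term(j) for j in range(4, p - k // 2 + 1))


# The case p = 2, where (K_1 u K_2) + K_p is the HVN.
bounds_p2 = {omega: theorem4_bound(omega, 2) for omega in (3, 4, 5)}
```

For $p = 2$ and $\omega(G) = 4$ this evaluates to 9, which is the value to compare
with any sharper bound for HVN-free graphs.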


The bound obtained in Theorem 4 is not optimal. This can be seen in
Theorem 5. Note that when $p = 2$, $(K_1 \cup K_2) + K_p \cong HVN$.


Theorem 5. If $G$ is a $\{P_3 \cup P_2, HVN\}$-free graph with $\omega(G) \geq 4$, then
$\chi(G) \leq \omega(G) + 1$.


The graph G* (defined next to Theorem 3) shows that the bound given in 
Theorem 5 is tight. 


4 $\{P_3 \cup P_2, 2K_1 + K_p\}$-free graphs


Let us start Sect. 4 by observing that any $\{P_3 \cup P_2, 2K_1 + K_p\}$-free graph
is also a $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graph. Hence the properties
established for $\{P_3 \cup P_2, (K_1 \cup K_2) + K_p\}$-free graphs are also true for
$\{P_3 \cup P_2, 2K_1 + K_p\}$-free graphs.

One can observe that by using techniques similar to the ones used in Theorem 3
and by the Strong Perfect Graph Theorem [3], any $\{P_3 \cup P_2, 2K_1 + K_p\}$-free
graph is perfect, when $\omega \geq 3p - 1$.


Theorem 6. Let $p$ be a positive integer and $G$ be a
$\{P_3 \cup P_2, 2K_1 + K_p\}$-free graph with $V(G) = V_1 \cup V_2$. If
$\omega(G) \geq 3p - 1$, then $\langle V_1 \rangle$ is complete, $\langle V_2 \rangle$
is $P_3$-free and $G$ is perfect.


As a consequence of Theorem 4 and Theorem 6, without much difficulty one can
observe Proposition 3 and Corollary 2.


Proposition 3. Let $G$ be a $\{P_3 \cup P_2, 2K_1 + K_p\}$-free graph. If
$\omega(\langle C_{1,2} \rangle) \geq 2p$, then $\chi(G) = \omega(G)$.


Corollary 2. Let $p$ be an integer greater than 1. If $G$ is a
$\{P_3 \cup P_2, 2K_1 + K_p\}$-free graph, then


p+1 
LG-Ve-5+3) for 3<w(G)<pt1 
j= 
x(G) S 4) w(G) + 2p- er ee 1)(p — § + 8) for w(G) = (pt 24+k),0<k< 2-5 
j=3 

w(G) + 2p—1 : for w(G) = 3p — 2 

w(G) for w(G) > 3p-1.


When $p = 2$, $2K_1 + K_p \cong$ diamond and hence by Theorem 6,
$\{P_3 \cup P_2, \text{diamond}\}$-free graphs are perfect for $\omega(G) \geq 5$, which
was shown by A. P. Bharathi et al., in [1].


Theorem 7 [1]. If $G$ is a $\{P_3 \cup P_2, \text{diamond}\}$-free graph then

\[
\chi(G) \leq
\begin{cases}
4 & \text{for } \omega(G) = 2 \\
6 & \text{for } \omega(G) = 3 \\
5 & \text{for } \omega(G) = 4
\end{cases}
\]

and $G$ is perfect if $\omega(G) \geq 5$.


We can further improve the bound given in Theorem 7 by obtaining an
$\omega(G)$-coloring when $\omega(G) = 4$.


Theorem 8. If $G$ is a $\{P_3 \cup P_2, \text{diamond}\}$-free graph then

\[
\chi(G) \leq
\begin{cases}
4 & \text{for } \omega(G) = 2 \\
6 & \text{for } \omega(G) = 3 \\
4 & \text{for } \omega(G) = 4
\end{cases}
\]

and $G$ is perfect if $\omega(G) \geq 5$.


Abstract. The complexity of the list homomorphism problem for signed
graphs appears difficult to classify. Existing results focus on special
classes of signed graphs, such as trees [1] and reflexive signed graphs [18].
Irreflexive signed graphs are the heart of the problem, and Kim and Siggers
have formulated a conjectured classification for these signed graphs.
We focus on a special case of irreflexive signed graphs, namely those in
which the unicoloured edges form a spanning path or cycle, and classify
the complexity of list homomorphisms to these signed graphs. In particular,
our results confirm the conjecture of Kim and Siggers for this class
of signed graphs.


1 Motivation and Background 


We investigate the complexity of (list) homomorphism problems for signed
graphs. The complexity of homomorphism (and list homomorphism) problems
is a popular topic. For undirected graphs, it was shown in [16] that the problem
of deciding the existence of homomorphisms from an input graph to a fixed
graph $H$ is polynomial if $H$ is bipartite or has a loop, and is NP-complete
otherwise. For general structures $H$, the corresponding problem led to the
so-called Dichotomy Conjecture [12,17], which was only recently established [8,25].
In the list homomorphism problem for $H$, the input contains, along with each
input graph, lists of allowed images for each vertex. (The precise definitions are
given below.) List homomorphism problems generally have a nicer behaviour than
homomorphism problems, because the lists facilitate recursion to subproblems.
For undirected graphs, the list homomorphism problem is polynomial if
$H$ is a bi-arc graph (see below), and is NP-complete otherwise [9,10]. Even for
general structures $H$, where the list version is equivalent to a special case of
the basic version, the classification for the list version was achieved a decade
earlier [7].

Signed graphs are related to graphs with two symmetric binary relations; in
addition, they are equipped with an operation of switching (explained below).
The possibility of switching poses challenges when classifying the complexity
of homomorphisms, as the problem no longer appears to be a homomorphism
problem for relational structures. Nevertheless, it can be shown that it is
equivalent to such a problem and hence the results from [8,25] imply that these
problems also enjoy a dichotomy of polynomial versus NP-complete. For
homomorphisms of signed graphs (without lists), a concrete dichotomy classification
was conjectured in [4], and proved in [6]. Interestingly, for signed graphs, the
list version no longer seems easier to classify, and the progress towards a
classification has been slow [1,4,18]. In this paper we focus on one particular
class of signed graphs and provide a full classification of complexity of the
corresponding homomorphism problem. In particular, our results confirm a conjecture
from [18] for this class of signed graphs.

A signed graph $\widehat{G}$ consists of a set $V(G)$ and two symmetric binary
relations $+, -$. We also view $\widehat{G}$ as a graph $G$ with the vertex set
$V(G)$, the edge set $+ \cup -$ (the underlying graph of $\widehat{G}$), and a mapping
$\sigma : E(G) \to \{+, -\}$, assigning a sign ($+$ or $-$) to each edge of $G$. (A
loop is considered to be an edge.) Two signed graphs are considered (switching-)
equivalent if one can be obtained from the other by a sequence of switchings;
switching at a vertex $v$ results in changing the signs of all edges incident to
$v$. We will usually view signs of edges as colours, and view positive edges as
blue, and negative edges as red. It will be convenient to call a red-blue pair of
edges with the same endpoint(s) a bicoloured edge; however, it is important to keep
in mind that formally they are two distinct edges.
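
The switching operation is easy to state in code. The following Python sketch
assumes a hypothetical encoding that is not from the paper: a signed graph is a set
of triples `(u, v, sign)`, so that a bicoloured edge is simply the presence of both
`(u, v, '+')` and `(u, v, '-')`. A loop at the switched vertex is left unchanged
here, which is the usual convention in the literature on signed graphs and should
be read as an assumption with respect to the text above.

```python
def switch(edges, w):
    """Return the signed graph obtained by switching at vertex w.

    edges: set of (u, v, sign) triples with sign in {'+', '-'}
    """
    flip = {"+": "-", "-": "+"}
    result = set()
    for (u, v, sign) in edges:
        # Flip the sign when exactly one endpoint is w (assumption: a loop
        # at w meets w twice, so its sign is left unchanged).
        if (u == w) != (v == w):
            sign = flip[sign]
        result.add((u, v, sign))
    return result


# A blue edge 1-2, a red edge 2-3 and a bicoloured edge 1-3.
edges = {(1, 2, "+"), (2, 3, "-"), (1, 3, "+"), (1, 3, "-")}
switched = switch(edges, 1)
```

Switching twice at the same vertex returns the original signed graph, and a
bicoloured edge stays bicoloured under any switching.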

The study of signed graphs seems to have originated in [14,15], and was most 
notably advanced in the papers of Zaslavsky [20-24]. Guenin [13] pioneered the 
investigation of homomorphisms of signed graphs; see also, e.g., 5 5] and [19]. 

A homomorphism of the signed graph $\widehat{G}$ to the signed graph $\widehat{H}$ is
a mapping $f : V(G) \to V(H)$ for which there exists a signed graph $\widehat{G}'$
equivalent to $\widehat{G}$ such that $f$ preserves both relations $+$ and $-$. A list
homomorphism of $\widehat{G}$ to $\widehat{H}$, with respect to the lists
$L(v) \subseteq V(H)$, $v \in V(G)$, is a homomorphism $f$ of $\widehat{G}$ to
$\widehat{H}$ such that $f(v) \in L(v)$ for all $v \in V(G)$. Let $\widehat{H}$ be a
fixed signed graph. The homomorphism problem for $\widehat{H}$ takes as input a signed
graph $\widehat{G}$ and asks whether there exists a homomorphism of $\widehat{G}$ to
$\widehat{H}$. The list homomorphism problem for $\widehat{H}$ takes as input a signed
graph $\widehat{G}$ with lists $L(v) \subseteq V(H)$ for every $v \in V(G)$, and asks
whether there exists a homomorphism $f$ of the signed graph $\widehat{G}$ to
$\widehat{H}$ such that $f(v) \in L(v)$ for every $v \in V(G)$. A subgraph
$\widehat{G}$ of the signed graph $\widehat{H}$ is the signed core of $\widehat{H}$ if
there is a signed graph homomorphism $f$ of $\widehat{H}$ to $\widehat{G}$, and every
homomorphism of the signed graph $\widehat{G}$ to itself is a bijection on $V(G)$. It
is easy to see that the signed core of any signed graph is unique up to isomorphism
and switching equivalence. The dichotomy classification conjectured in [4] and
proved in [6] is as follows. (In counting edges we count each unicoloured edge as
one and each bicoloured edge as two.)


Theorem 1 [6]. The homomorphism problem for the signed graph $\widehat{H}$ is
polynomial-time solvable if the signed core of $\widehat{H}$ has at most two edges,
and is NP-complete otherwise.


In an earlier paper [3], cf. [1], we have classified the complexity of the list 
homomorphism problem for signed graphs with only unicoloured edges. A signed 
graph is balanced if it is equivalent to one without red edges (and bicoloured 
edges), and is anti-balanced if it is equivalent to one without blue edges (and 
bicoloured edges); here we view a bicoloured edge as both blue and red. We say 
that a signed graph is weakly balanced (weakly anti-balanced) if it is equivalent 
to one in which all edges are bicoloured or blue (respectively red). (Previously 
[1] we used the slightly awkward terms ‘uni-balanced’ and ‘anti-uni-balanced’.) 
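
For signed graphs without bicoloured edges, balance can be tested by a standard 2-labelling argument (Harary's characterisation): the graph is balanced exactly when its vertices can be split into two classes so that blue edges stay inside a class and red edges cross between classes. The sketch below assumes that characterisation; the edge representation and the names `is_balanced` and `is_anti_balanced` are illustrative choices, not notation from the paper.

```python
from collections import deque


def is_balanced(vertices, edges):
    """True iff some sequence of switchings makes every edge blue (+1).

    edges: iterable of (u, v, sign) with sign in {+1, -1}.
    """
    adj = {v: [] for v in vertices}
    for u, v, sign in edges:
        adj[u].append((v, sign))
        adj[v].append((u, sign))
    side = {}
    for root in vertices:
        if root in side:
            continue
        side[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                # blue edges keep the side, red edges change it
                want = side[u] if sign > 0 else 1 - side[u]
                if v not in side:
                    side[v] = want
                    queue.append(v)
                elif side[v] != want:
                    return False
    return True


def is_anti_balanced(vertices, edges):
    """True iff some sequence of switchings makes every edge red (-1)."""
    return is_balanced(vertices, [(u, v, -s) for u, v, s in edges])
```

A bicoloured edge, stored as one blue and one red triple, makes both tests fail, in line with the definitions above.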

Let $C$ be a fixed circle with two specified points $n$ and $s$. A bi-arc graph 
is a graph $H$ such that each vertex $v \in V(H)$ can be associated with a pair 
of intervals $N_v, S_v$, where $N_v$ contains $n$ but not $s$ and $S_v$ contains $s$ but not $n$, 
satisfying the following conditions: (i) $N_v$ intersects $S_w$ if and only if $S_v$ intersects 
$N_w$, and (ii) $N_v$ intersects $S_w$ if and only if $vw$ is not an edge of $H$. This class 
of graphs includes all interval graphs: a reflexive graph is a bi-arc graph if and 
only if it is an interval graph. Moreover, an irreflexive graph is a bi-arc graph if 
and only if it is bipartite and its complement is a circular arc graph [10]. 


Theorem 2 [3]. Suppose H is a connected signed graph without bicoloured edges. 
If the underlying graph H is a bi-arc graph, and H is balanced or anti-balanced, 
then the list homomorphism problem for H is polynomial-time solvable. Other- 
wise, the problem is NP-complete. 


Additionally, in [1] we have classified the complexity of the list homomorphism 
problems for signed trees. The general classification is quite technical, 
but we will give a simplified description in the special case of irreflexive trees $H$. 
The following concept plays an important role. Let $U, D$ be two walks in $H$ of 
equal length. Suppose $U$ has vertices $u = u_0, u_1, \dots, u_k = v$, and $D$ has vertices 
$u = d_0, d_1, \dots, d_k = v$. We say that $(U, D)$ is a chain, provided $uu_1, d_{k-1}v$ are 
unicoloured edges and $ud_1, u_{k-1}v$ are bicoloured edges, and for each $i$, $1 \le i \le k-2$, 
we have (1) both $u_iu_{i+1}$ and $d_id_{i+1}$ are edges of $H$ while $d_iu_{i+1}$ is not an edge 
of $H$, or (2) both $u_iu_{i+1}$ and $d_id_{i+1}$ are bicoloured edges of $H$ while $d_iu_{i+1}$ is 
not a bicoloured edge of $H$. 


Theorem 3 [1]. If a signed graph H contains a chain, then the list homomor- 
phism problem for H is NP-complete. 


Figure 1 shows some important signed trees with a chain. 

Fig. 1. The family $\mathcal{F}$ of signed graphs yielding NP-complete problems, and a chain in 
each. (The figure appeared first in [1].) (Color figure online) 


An invertible pair in an undirected graph $H$ is a pair of vertices $a, b$, with 
two walks $U, D$ of the same length, where $U$ has vertices $a = u_0, u_1, \dots, u_k = b, 
u_{k+1}, \dots, u_\ell = a$, and $D$ has vertices $b = d_0, d_1, \dots, d_k = a, d_{k+1}, \dots, d_\ell = b$, 
such that for each $i$, $1 \le i \le k-2$, both $u_iu_{i+1}$ and $d_id_{i+1}$ are edges of $H$, while 
$d_iu_{i+1}$ is not an edge of $H$. For simplicity we say that a signed graph has an 
invertible pair if its underlying graph has an invertible pair. It follows from [1,9] 
that we have the following observation. 


Theorem 4. If $H$ has an invertible pair, then the list homomorphism problem 
for $H$ is NP-complete. 


Figure 2 shows the graph $F_1$, with an invertible pair 1, 10. The walks $U, D$ 
begin as indicated, then continue from 7, 10 to 7, 1 in a similar manner, and then 
to 10, 1, and similarly for the second half, from 10, 1 to 1, 10. 


Fig. 2. The graph $F_1$, with an invertible pair. (Labels in the figure: $a = 1$, $b = 10$; 
the walks begin $U = 1, 2, 3, 4, 5, 6, 7$ and $D = 10, 9, 10, 9, 10, 9, 10$.) 

The following result from [1] therefore implies that for irreflexive signed trees, 
the only NP-complete cases have a chain or an invertible pair. (We will make 
the same conclusion about the irreflexive signed graphs discussed in this paper.) 


Theorem 5 [1]. Let $H$ be an irreflexive tree. If the underlying graph of $H$ 
contains $F_1$ or $H$ contains a signed graph from the family $\mathcal{F}$, as an induced 
subgraph, then the list homomorphism problem to $H$ is NP-complete. Otherwise, 
$H$ admits a special min ordering and the problem is polynomial-time solvable. 


Further progress on the list homomorphism problem for signed graphs can be 
made by transforming the list homomorphism problem for the signed graph $H$ to 
a list homomorphism problem for an auxiliary structure with two binary relations 
red, blue. (In such a structure we do not allow switchings.) We call a structure 
with two binary relations red, blue an edge-coloured graph. The switching graph 
$S(H)$ of $H$ is an edge-coloured graph with two vertices $v_1, v_2$ for each vertex $v$ of 
$H$, and each edge $vw$ of $H$ yields edges $v_1w_1, v_2w_2$ of the same colour as $vw$ and 
edges $v_1w_2, v_2w_1$ of the opposite colour. (This definition applies also for loops, 
i.e., when $v = w$.) Each homomorphism of the signed graph $G$ to the signed 
graph $H$ corresponds to a homomorphism of the edge-coloured graph $G$ to the 
edge-coloured graph $S(H)$ and conversely. If $G$ has lists $L(v)$, $v \in V(G)$, then 
the new lists $L^+(v)$, $v \in V(G)$, for $S(H)$ are defined as follows: for any $x \in L(v)$ 
with $v \in V(G)$, we place both $x_1$ and $x_2$ in $L^+(v)$. It is easy to see that the 
signed graph $G$ has a list homomorphism to the signed graph $H$ with respect 
to the lists $L$ if and only if the edge-coloured graph $G$ has a list homomorphism 
to the edge-coloured graph $S(H)$ with respect to the lists $L^+$. The new lists 
$L^+$ are symmetric sets in $S(H)$, meaning that for any $x \in V(H)$, $v \in V(G)$, 
we have $x_1 \in L^+(v)$ if and only if we have $x_2 \in L^+(v)$. Thus we obtain the 
list homomorphism problem for the edge-coloured graph $S(H)$, restricted to 
input instances $G$ with lists $L$ that are symmetric in $S(H)$. We shall call the 
corresponding vertices $x_1, x_2$ mates. 
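
The construction of the switching graph and of the symmetric lists can be sketched in a few lines. The data layout (colours as the strings `"blue"` and `"red"`, the copy $v_i$ of a vertex encoded as the pair `(v, i)`) and the function names are assumptions chosen for this illustration.

```python
def switching_graph(vertices, edges):
    """Build the switching graph of a signed graph.

    edges: iterable of (v, w, colour) with colour in {"blue", "red"};
    a bicoloured edge is given as two triples.
    Returns (S_vertices, S_edges), where each edge is (frozenset of endpoints, colour).
    """
    other = {"blue": "red", "red": "blue"}
    S_vertices = {(v, i) for v in vertices for i in (1, 2)}
    S_edges = set()
    for v, w, colour in edges:
        for i in (1, 2):
            # v_i w_i keeps the colour, v_i w_{3-i} gets the opposite colour
            S_edges.add((frozenset({(v, i), (w, i)}), colour))
            S_edges.add((frozenset({(v, i), (w, 3 - i)}), other[colour]))
    return S_vertices, S_edges


def symmetric_lists(lists):
    """Replace each x in L(v) by both of its copies, giving the lists L^+."""
    return {g: {(x, i) for x in xs for i in (1, 2)} for g, xs in lists.items()}
```

A single blue edge $vw$ thus produces four edges in the switching graph, two blue and two red, and every list produced by `symmetric_lists` contains a vertex together with its mate.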

A polymorphism of an edge-coloured graph $H$ is a homomorphism $f$ of 
some power $H^t$ to $H$, i.e., a function $f$ that assigns to each ordered $t$-tuple 
$(v_1, v_2, \dots, v_t)$ of vertices of $H$ a vertex $f(v_1, v_2, \dots, v_t)$ such that two tuples 
that are coordinate-wise adjacent in blue (red) obtain images adjacent in blue (red). A 
polymorphism of order $t = 3$ is a majority if $f(v,v,w) = f(v,w,v) = f(w,v,v) = v$ 
for all $v, w$. A polymorphism of order $t = 4$ is a Siggers polymorphism if 
$f(a,r,e,a) = f(r,a,r,e)$ for all $a, r, e$. One formulation of the dichotomy theorem 
proved by Bulatov [8] and Zhuk [25] states that the constraint satisfaction 
problem for the template $H$ is polynomial-time solvable if $H$ admits a 
Siggers polymorphism, and is NP-complete otherwise. (Other equivalent versions 
refer to other useful polymorphisms, notably weak near-unanimity 
polymorphisms [8, 18, 25].) Majority polymorphisms are less powerful, but it is known 
(see [12,17]) that if $H$ admits a majority then the constraint satisfaction problem 
for the template $H$ is polynomial-time solvable. We say that a polymorphism is 
conservative if $f(v_1, v_2, \dots, v_t)$ is always one of $v_1, v_2, \dots, v_t$, and we say that a 
polymorphism of $S(H)$ is semi-conservative if $f(v_1, v_2, \dots, v_t)$ is always one of 
$v_1, v_2, \dots, v_t$ or their mates. 


To distinguish the two parts of a bipartite graph we speak of black and white 
vertices. A min ordering of a bipartite edge-coloured graph $H$ is a pair $<_b, <_w$, 
where $<_b$ is a linear ordering of the black vertices and $<_w$ is a linear ordering 
of the white vertices, such that for white vertices $x <_w x'$ and black vertices 
$y <_b y'$, if $xy', x'y$ are both red (blue) edges in $H$, then $xy$ is also a red (blue) 
edge in $H$. It is known [12] that if a bipartite graph $H$ has a min ordering, then 
the list homomorphism problem for $H$ can be solved in polynomial time. In fact, 
min ordering can be viewed as a polymorphism of order $t = 2$ [12]. We call a 
bipartite min ordering of the signed irreflexive tree $H$ special if for black vertices 
$x, x'$ and white vertices $y, y'$, if $xy$ is bicoloured and $xy'$ is unicoloured, then 
$y <_w y'$, and if $xy$ is bicoloured and $x'y$ is unicoloured, then $x <_b x'$. In other 
words, the bicoloured neighbours of any vertex appear before its unicoloured 
neighbours. 
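
The min ordering condition is easy to verify by brute force on small examples. The following sketch checks exactly the implication stated above; the representation (orders as Python lists, edges as triples `(white, black, colour)`, a bicoloured edge stored once per colour) and the name `is_min_ordering` are assumptions for illustration.

```python
from itertools import combinations


def is_min_ordering(white_order, black_order, edges):
    """Check the min ordering condition for both colours.

    white_order, black_order: lists giving the two linear orders, smallest first.
    edges: set of (white_vertex, black_vertex, colour) triples.
    """
    for colour in ("red", "blue"):
        for x, x2 in combinations(white_order, 2):        # x comes before x2
            for y, y2 in combinations(black_order, 2):    # y comes before y2
                if ((x, y2, colour) in edges and (x2, y, colour) in edges
                        and (x, y, colour) not in edges):
                    return False
    return True
```

The check takes time quadratic in the size of each part per colour, which is adequate for experimenting with the small graphs in the figures.
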
For weakly balanced irreflexive signed graphs, [18] suggests the following. 


Conjecture. For a weakly balanced irreflexive signed graph H, the list homo- 
morphism problem is polynomial-time solvable if H has a special min order- 
ing; otherwise H contains a chain or an invertible pair and the problem is NP- 
complete. 


We note that in [18], the authors prove that the existence of a special min ordering 
implies the existence of a semi-conservative majority, which means that the 
problem is polynomial-time solvable; so to confirm their conjecture it remains 
only to prove the remaining cases are NP-complete. 

Theorem 5 from [1] confirms the above conjecture, when H is a signed tree. 

We say that an irreflexive signed graph H is path-separable (cycle-separable) 
if the unicoloured edges of H form a spanning path (cycle) in the underlying 
graph of H. For brevity we also say a signed graph is separable if it is path-separable 
or cycle-separable. In this paper we explicitly classify the complexity 
of the list homomorphism problem for separable signed graphs H, see Theorems 
6 and 7. The descriptions suggest that the polynomial cases are rather rare and 
very nicely structured. 

In particular, we confirm the above conjecture in the special case of separable 
signed graphs. Moreover, in our results we do not assume that H is weakly 
balanced. 


2 Path-Separable Signed Graphs 


Irreflexive signed graphs are in a sense the core of the problem. By Theorem 1, 
the list homomorphism problem for H is NP-complete unless the underlying 
graph H is bipartite. There is a natural transformation of each general problem 
to a problem for a bipartite irreflexive signed graph, akin to what is done for 
unsigned graphs in [11]; this is nicely explained in [18]. 

However, for bipartite $H$, we don't have a combinatorial classification beyond 
the case of trees $H$, except in the case $H$ has no bicoloured edges or loops 
(when Theorem 2 applies), or when $H$ has no unicoloured edges or loops (when 
the problem essentially concerns unsigned graphs and thus is solved by [11]). 
Therefore we may assume that both bicoloured and unicoloured edges or loops 
are present. We focus in this paper on those bipartite irreflexive signed graphs $H$ 
in which the unicoloured edges form simple structures, such as paths and cycles. 
In this section, we consider irreflexive signed graphs in which the unicoloured 
edges form a spanning path. 

Recall that an irreflexive signed graph $H$ is path-separable if the unicoloured 
edges of $H$ form a hamiltonian path $P$ in the underlying graph $H$. We may 
assume the edges of $P$ are all blue. In other words, all the edges of the hamiltonian 
path $P$ are blue, and all the other edges of $H$ are bicoloured. Recall 
that the distinction between unicoloured and bicoloured edges is independent of 
switching, thus such a hamiltonian path $P = v_1v_2 \dots v_n$ is unique, if it exists. 

We first observe that for any irreflexive signed graph $H$, the list homomorphism 
problem for $H$ is NP-complete if the underlying graph $H$ contains an odd 
cycle, since then the signed core of $H$ has at least three edges. Moreover, we now show 
that the list homomorphism problem for $H$ is also NP-complete if $H$ contains 
an induced cycle of length greater than four. Indeed, it suffices to prove this if 
$H$ is an even cycle of length $k > 4$. If all edges of $H$ are unicoloured, then the 
problem is NP-complete by Theorem 2, since an irreflexive cycle of length $k > 4$ 
is not a bi-arc graph. If all edges of the cycle $H$ are bicoloured, then we can easily 
reduce from the previous case. If $H$ contains both unicoloured and bicoloured 
edges, then $H$ contains an induced subgraph of type a) or b) in the family $\mathcal{F}$ in 
Fig. 1, and the problem is NP-complete by Theorem 3. (There are cases when 
the subgraphs are not induced, but the chains from the proof of Theorem 3 are 
still applicable.) 

We further identify two additional cases of $H$ with NP-complete list homomorphism 
problems. An alternating 4-cycle is a 4-cycle $v_1v_2v_3v_4$ in which the 
edges $v_1v_2, v_3v_4$ are bicoloured and the edges $v_2v_3, v_4v_1$ unicoloured. A 4-cycle 
pair consists of 4-cycles $v_1v_2v_3v_4$ and $v_1v_5v_6v_7$, sharing the vertex $v_1$, in which 
the edges $v_1v_2, v_1v_5$ are bicoloured, and all other edges are unicoloured. An alternating 
4-cycle has the chain $U = v_1, v_4, v_3$; $D = v_1, v_2, v_3$, and a 4-cycle pair has 
the chain $U = v_1, v_4, v_3, v_2, v_1$; $D = v_1, v_5, v_6, v_7, v_1$. Therefore, if a signed graph 
$H$ contains an alternating 4-cycle or a 4-cycle pair as an induced subgraph, then 
the list homomorphism problem for $H$ is NP-complete. Note that the latter chain 
requires only $v_2v_6$ and $v_3v_5$ to be non-edges. The problem remains NP-complete 
as long as these edges are absent; all other edges with endpoints in different 
4-cycles can be present. If both $v_2v_6$ and $v_3v_5$ are bicoloured edges, then there 
is an alternating 4-cycle $v_2v_3v_5v_6$. Thus we conclude that the problem is NP-complete 
if $H$ contains a 4-cycle pair as a subgraph (not necessarily induced), 
unless exactly one of $v_3v_5$ or $v_2v_6$ is a bicoloured edge. 

From now on we will assume that $H$ is a path-separable signed graph with the 
unicoloured edges (all blue) forming a hamiltonian path $P = v_1, \dots, v_n$. We will 
assume further that the list homomorphism problem for $H$ is not NP-complete, 
and derive information on the structure of $H$. In particular, the underlying graph 
$H$ is bipartite and does not contain any induced cycles of length greater than 4, 
and $H$ does not contain an alternating 4-cycle or a 4-cycle pair; more generally, 
$H$ does not contain a chain. If $H$ has no bicoloured edges (and hence no edges 
not on $P$), then the list homomorphism problem for $H$ is polynomially solvable 
by Theorem 2, since a path is a bi-arc graph. If there is a bicoloured edge in $H$, 
then we may assume there is an edge $v_iv_{i+3}$, otherwise there is an induced cycle 
of length greater than 4. 

A block in a path-separable signed graph $H$ is a subpath $v_iv_{i+1}v_{i+2}v_{i+3}$ of $P$, 
with the bicoloured edge $v_iv_{i+3}$. The previous paragraph concluded that $H$ must 
contain a block. Note that if $v_iv_{i+1}v_{i+2}v_{i+3}$ is a block, then $v_{i+1}v_{i+2}v_{i+3}v_{i+4}$ cannot 
be a block: in fact, $v_{i+1}v_{i+4}$ cannot be a bicoloured edge, otherwise $H$ would 
contain an alternating 4-cycle. However, $v_{i+2}v_{i+3}v_{i+4}v_{i+5}$ can again be a block 
and so can $v_{i+4}v_{i+5}v_{i+6}v_{i+7}$, etc. If both $v_iv_{i+1}v_{i+2}v_{i+3}$ and $v_{i+2}v_{i+3}v_{i+4}v_{i+5}$ 
are blocks then $v_iv_{i+5}$ must be a bicoloured edge, otherwise $v_iv_{i+3}v_{i+2}v_{i+5}$ would 
induce a signed graph of type a) in family $\mathcal{F}$ from Fig. 1. A segment in $H$ is a 
maximal subpath $v_iv_{i+1} \dots v_{i+2j+1}$ of $P$ with $j \ge 1$ that has all bicoloured edges 
$v_{i+e}v_{i+e+3}$, where $e$ is even, $0 \le e \le 2j-2$. (A maximal subpath is not properly 
contained in another such subpath.) Thus each subpath $v_{i+e}v_{i+e+1}v_{i+e+2}v_{i+e+3}$ 
of the segment is a block, and the segment is a consecutively intersecting sequence 
of blocks; note that it can consist of just one block. Two segments can touch as 
the second and third segment in Fig. 3, or leave a gap as the first and second 
segment in the same figure. 
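
Blocks and segments can be read off directly from the set of bicoloured edges. The sketch below is an illustration under stated assumptions: vertices are identified with their subscripts $1, \dots, n$ along $P$, bicoloured edges are given as `frozenset` pairs of subscripts, and no two blocks start at consecutive vertices (which the argument above guarantees when the problem is not NP-complete). The names `blocks` and `segments` are hypothetical.

```python
def blocks(n, bicoloured):
    """Subscripts i such that v_i v_{i+1} v_{i+2} v_{i+3} is a block."""
    return [i for i in range(1, n - 2) if frozenset({i, i + 3}) in bicoloured]


def segments(n, bicoloured):
    """Return the segments as (first, last) subscript pairs, in order along P."""
    segs = []
    for i in blocks(n, bicoloured):
        if segs and segs[-1][1] == i + 1:
            # the block starting at v_i overlaps the end of the current segment
            segs[-1][1] = i + 3
        else:
            segs.append([i, i + 3])
    return [tuple(s) for s in segs]


# Block edges matching the three segments described for Fig. 3, with n = 20 assumed.
example = {frozenset(p) for p in [(5, 8), (7, 10), (5, 10), (12, 15),
                                  (15, 18), (17, 20), (15, 20)]}
```

On this example the second and third segments touch at the vertex with subscript 15, while the first and second leave a gap.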

In a segment $v_iv_{i+1} \dots v_{i+2j+1}$ we call each vertex $v_{i+e}$ with $0 \le e \le 2j-2$ 
a forward source, and each vertex $v_{i+o}$ with $3 \le o \le 2j+1$ a backward source 
(as before, $e$ is even and $o$ is odd). 
Thus forward sources are the beginning vertices of blocks in the segment, and the 
backward sources are the ends of blocks in the segment. If $a < b$, we say the edge 
$v_av_b$ is a forward edge from $v_a$ and a backward edge from $v_b$. In this terminology, 
each forward source has a forward edge to its corresponding backward source. 
Because of the absence of a signed graph of type a) in family $\mathcal{F}$ from Fig. 1, we 
can in fact conclude, by the same argument as in the previous paragraph, that 
each forward source in a segment has forward edges to all backward sources in 
the segment. 

We say that a segment $v_iv_{i+1} \dots v_{i+2j+1}$ is right-leaning if $v_{i+e}v_{i+e+o}$ is a 
bicoloured edge for all even $e$, $0 \le e \le 2j-2$, and all odd $o \ge 3$; and 
we say it is left-leaning if $v_{i+2j+1-e}v_{i+2j+1-e-o}$ is a bicoloured edge for all even 
$e$, $0 \le e \le 2j-2$, and all odd $o \ge 3$. Thus in a right-leaning segment 
each forward source has all possible forward edges (that is, all edges to vertices 
of opposite colour in the bipartition, including vertices with subscripts greater 
than $i + 2j + 1$). The concepts of left-leaning segments, backward sources and 
backward edges are defined similarly. 

We say that a path-separable signed graph $H$ is right-segmented if all segments 
are right-leaning, and there are no edges other than those mandated by 
this fact. In other words, each forward source has all possible forward edges, and 
each vertex which is not a forward source has no forward edges. Similarly, we 
say that a path-separable signed graph $H$ is left-segmented if all segments are 
left-leaning, and there are no edges other than those mandated by this fact. In 
other words, each backward source has all possible backward edges, and each 
vertex which is not a backward source has no backward edges. Finally, $H$ is 
left-right-segmented if there is a unique segment $v_iv_{i+1} \dots v_{i+2j+1}$ that is both 
left-leaning and right-leaning, all segments preceding it are left-leaning, all segments 
following it are right-leaning, and moreover there are additional bicoloured 
edges $v_{i-e}v_{i+2j+o}$ for all even $e \ge 2$ and all odd $o \ge 3$, but no other edges. In 
other words, vertices $v_1, v_2, \dots, v_{i+2j+1}$ induce a left-segmented graph, vertices 
$v_i, v_{i+1}, \dots, v_n$ induce a right-segmented graph, and in addition to the edges 
this requires there are all the edges joining $v_{i-e}$ from $v_1, \dots, v_{i-1}$ to $v_{i+2j+o}$ from 
$v_{i+2j+2}, \dots, v_n$, with even $e$ and odd $o$. A segmented graph is a path-separable 
signed graph that is right-segmented or left-segmented or left-right-segmented. 
In Fig. 3 there are three segments: the left-leaning segment $v_5v_6v_7v_8v_9v_{10}$, 
the left- and right-leaning segment $v_{12}v_{13}v_{14}v_{15}$, and the right-leaning segment 
$v_{15}v_{16}v_{17}v_{18}v_{19}v_{20}$. Thus this is a left-right-segmented signed graph. 


Fig. 3. An example of a left-right-segmented signed graph. The additional bicoloured 
edges from all white vertices before $v_{12}$ to all black vertices after $v_{15}$ are not shown. 
(The figure appeared first in [1].) (Color figure online) 


Theorem 6. Let $H$ be a path-separable signed graph. Then the list homomorphism 
problem for $H$ is polynomial-time solvable if $H$ is switching equivalent to 
a segmented signed graph. Otherwise, the problem is NP-complete. 


The NP-completeness is proved in the journal version. To show that the 
problem is polynomial when H is a segmented signed graph, we use the result 
from [18] which asserts that the existence of a special min ordering ensures the 
existence of a polynomial-time algorithm. (We provide an alternate proof in [2].) 

We now describe a special min ordering of the vertices for the case of a right-segmented 
signed graph. Consider two vertices $v_x$ and $v_y$ with even subscripts 
$x < y$, and each with a forward bicoloured edge. Then all forward neighbours 
of $v_y$ are also forward neighbours of $v_x$, and all backward neighbours of $v_x$ are 
also backward neighbours of $v_y$. Vertices $v_z$ with no forward bicoloured edges 
have backward edges to all vertices with forward bicoloured edges from vertices 
$v_t$, with $t$ odd and $t < z$. We now order the vertices with even subscripts as 
follows: first we take vertices with forward bicoloured edges in increasing order 
of subscripts, then we take the remaining vertices in the decreasing order of 
subscripts. The same ordering is applied on the vertices with odd subscripts. 
It now follows from our observations that this is a min ordering, and for every 
vertex the bicoloured edges come before the unicoloured edges. 
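
The ordering rule just described is simple enough to state as code. In this sketch, which is only an illustration, vertices are identified with their subscripts $1, \dots, n$, and `forward_sources` is assumed to be the set of subscripts of vertices that have a forward bicoloured edge; the function name `special_order` is hypothetical.

```python
def special_order(n, forward_sources):
    """Return (even_order, odd_order) for a right-segmented graph on v_1, ..., v_n.

    Within each parity class: vertices with a forward bicoloured edge come first,
    by increasing subscript; the remaining vertices follow by decreasing subscript.
    """
    def order(parity):
        same = [i for i in range(1, n + 1) if i % 2 == parity]
        first = [i for i in same if i in forward_sources]
        rest = [i for i in reversed(same) if i not in forward_sources]
        return first + rest

    return order(0), order(1)
```

For instance, with $n = 6$ and a single block starting at $v_1$, the vertex $v_1$ sees its bicoloured neighbours $v_6, v_4$ before its unicoloured neighbour $v_2$ in the resulting order of the even subscripts.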

For left-segmented graphs the ordering is similar; for left-right-segmented 
graphs the ordering is described in the journal version. 


3 Cycle-Separable Signed Graphs 


As an application of Theorem 6, we now consider irreflexive signed graphs in 
which the unicoloured edges form a spanning cycle. We say that an irreflexive 
signed graph $H$ is cycle-separable if the unicoloured edges of $H$ form a hamiltonian 
cycle $C$ in the underlying graph $H$. In other words, we have a hamiltonian 
cycle $C$ whose edges are all unicoloured, and all the other edges of $H$ are 
bicoloured. In contrast to the path-separable signed graphs, we cannot assume 
the edges of $C$ are all blue, see below. 

We first introduce three cycle-separable signed graphs for which the list 
homomorphism problem will turn out to be polynomial-time solvable. The signed 
graph $H_0$ is the 4-cycle with all edges unicoloured blue. The signed graph $H_1$ 
consists of a blue path $b = t_0, t_1, t_2, t_3 = w$, a red path $b, s_1, s_2, w$, together with 
a bicoloured edge $bw$. The signed graph $H_\ell$ consists of a blue path $b, s_1, s_2, w$, 
a blue path $b = t_0, t_1, t_2, \dots, t_\ell = w$ (with $\ell \ge 3$ odd), and all bicoloured edges 
$t_it_j$ with even $i$ and odd $j$, $j > i + 1$. (Note that this includes the edge $bw$.) 
These three cycle-separable signed graphs are illustrated in Fig. 4. Note that if 
the subscript $\ell$ is greater than 0, then it is odd. Moreover, both $H_1$ and $H_3$ have 
6 vertices and differ only in the colours of the unicoloured edges forming the 
hamiltonian cycle $C$: $H_1$ has the cycle $C$ unbalanced, and $H_3$ has the cycle $C$ 
balanced. 
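
The family $H_\ell$ can be generated mechanically from its description. The following is a sketch under assumptions: vertex names are plain strings (`"b"`, `"w"`, `"s1"`, `"s2"`, `"t1"`, ...), edges are `frozenset` pairs, and the function name `build_H` is hypothetical.

```python
def build_H(ell):
    """Return (vertices, blue_edges, bicoloured_edges) of H_ell for odd ell >= 3."""
    if ell < 3 or ell % 2 == 0:
        raise ValueError("ell must be odd and at least 3")
    t = ["b"] + [f"t{i}" for i in range(1, ell)] + ["w"]   # t_0 = b, ..., t_ell = w
    s = ["b", "s1", "s2", "w"]
    # the two blue paths together form the hamiltonian cycle C
    blue = {frozenset(p) for path in (s, t) for p in zip(path, path[1:])}
    # all pairs t_i t_j with i even, j odd and j > i + 1
    bicoloured = {frozenset({t[i], t[j]})
                  for i in range(0, ell + 1, 2)
                  for j in range(1, ell + 1, 2) if j > i + 1}
    return set(s) | set(t), blue, bicoloured
```

For $\ell = 3$ the only bicoloured edge produced is $bw$ and the graph has 6 vertices, matching the remark above; for $\ell = 5$ the bicoloured edges are $bt_3$, $bw$ and $t_2w$.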


Fig. 4. The cycle-separable signed graphs $H_0$, $H_1$, and $H_\ell$ with $\ell \ge 3$ odd. 


Theorem 7. Let $H$ be a cycle-separable signed graph. Then the list homomorphism 
problem for $H$ is polynomial-time solvable if $H$ is switching equivalent 
to $H_0$, or to $H_1$, or to $H_\ell$ for some odd $\ell \ge 3$. Otherwise, the problem is 
NP-complete. 


The NP-complete cases are again found in the journal version. Here we show 
that the list homomorphism problem for $H$ can be solved in polynomial time for 
all remaining cycle-separable signed graphs. 

If $H$ is switching equivalent to $H_0$, then the list homomorphism problem for 
$H$ is polynomial-time solvable by Theorem 2. 


32 J. Bok et al. 


Let $G$ together with lists $L$ be an instance of List-S-Hom($H$). We may 
assume $G$ is connected and bipartite. We will call the vertices of parts of the 
bipartition in $G$ black and white as well. First, we try mapping the black vertices of $G$ 
to the black vertices of $H$. If that fails, we try mapping the white vertices of $G$ 
to the black vertices of $H$. In the former case we remove all white (respectively 
black) vertices from the lists of the black (respectively white) vertices in $G$. The 
latter case is analogous. 

First, we perform the arc consistency procedure (cf. [9]) and also the arc 
consistency procedure for bicoloured edges. If there is a vertex with empty list 
after this step, then no suitable list homomorphism exists. Otherwise, we define 
two mappings $f_1$ and $f_2$ as follows. 


— $f_1(v) = \min\{t_i : t_i \in L(v)\}$, 
— $f_2(v) = \min\{s_i : s_i \in L(v)\}$. 


(Observe that when $L(v)$ consists of black vertices, $f_i(v)$ is the vertex ($t_j$ for $i = 1$, 
or $s_j$ for $i = 2$) with the smallest index, and conversely when $L(v)$ consists of 
white vertices, $f_i(v)$ is the vertex with the largest index.) Let $uv$ be a bicoloured 
edge of $G$ with $u$ black and $v$ white. Then by arc consistency there is a bicoloured 
edge between a vertex from $L(u)$ and a vertex from $L(v)$. By the labelling of 
$H$, it has the form $t_{2i}t_{2j+3}$, where $i \le j$. By our observation, $f_1(u) = t_{2i'}$ 
where $i' \le i$. Similarly, $f_1(v) = t_{2j'+3}$ where $j' \ge j$. This implies $i' \le j'$ and 
consequently, $f_1(u)f_1(v)$ is a bicoloured edge. A similar argument applies for $f_2$. 
Similarly, if $uv$ is a unicoloured edge in $G$ with $u$ black and $v$ white, then there 
is a (possibly bicoloured) edge $t_{2i}t_{2j+1}$ in $H$ where $t_{2i} \in L(u)$ and $t_{2j+1} \in L(v)$ 
with $j \ge i - 1$. Again $f_1(u) = t_{2i'}$ with $i' \le i$ and $f_1(v) = t_{2j'+1}$ with $j' \ge j$. 
This implies $t_{2i'}t_{2j'+1}$ is an edge of $H$. We conclude that both $f_1$ and $f_2$ are 
list homomorphisms from $G$ to $H$ (the underlying graphs) with the additional 
property that vertices adjacent by bicoloured edges in $G$ map to vertices adjacent 
by bicoloured edges in $H$. We now examine the signs of the unicoloured edges 
and determine the switchings required to define a list homomorphism from $G$ to 
$H$. We make the following key observation. Due to the ordering on the vertices 
if, for example, $f_1(u)f_1(v)$ is a unicoloured edge in $H$ (again $u$ is black and $v$ 
is white), then under no list homomorphism of $G$ to $H$ (with lists $L$) does $uv$ 
map to a bicoloured edge. (If such a mapping did exist, then the bicoloured edge 
would remain as a possible image during the consistency check, and a bicoloured 
edge would have end points occurring first in the ordering $<$.) 
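
The arc consistency step used at the start of this procedure can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: graphs are given as sets of `frozenset` edges without loops in $G$, lists as a dictionary of sets, and the name `arc_consistency` is an assumption.

```python
def arc_consistency(g_edges, h_edges, lists):
    """Prune the lists until every edge of G is supported in both directions.

    g_edges, h_edges: sets of frozenset({a, b}); G is assumed to have no loops.
    lists: dict mapping each vertex of G to a set of vertices of H.
    Returns new lists; an empty list certifies that no list homomorphism exists.
    """
    lists = {v: set(xs) for v, xs in lists.items()}
    changed = True
    while changed:
        changed = False
        for e in g_edges:
            u, v = tuple(e)
            for a, b in ((u, v), (v, u)):
                # keep x in L(a) only if it has a neighbour in L(b)
                supported = {x for x in lists[a]
                             if any(frozenset({x, y}) in h_edges for y in lists[b])}
                if supported != lists[a]:
                    lists[a] = supported
                    changed = True
    return lists
```

To mirror the two procedures mentioned in the text, the same function could be run once with all edges of the underlying graphs and once with only the bicoloured edges of $G$ and $H$.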

If $b \in L(v)$ for some black vertex $v$, then $f_1(v) = f_2(v) = b$. Analogously, 
if $w \in L(v)$ for some white vertex $v$, then $f_1(v) = f_2(v) = w$. That is, any 
vertex that can map to $b$ (respectively $w$) will be mapped to $b$ (respectively 
$w$). Moreover, when examining the resigning of vertices (below), if there is no 
resigning that works when $v$ maps to $b$, then there is no homomorphism at all, as 
$b$ dominates all white vertices (in $H$). Similarly $w$ dominates all black vertices. 

Consequently, we can partition the vertices of $G$ into those mapped to $b$ 
or $w$ under $f_1$ and under $f_2$ and those vertices that can only map to interior 
vertices of the two segments, i.e. to $t_1, \dots, t_{\ell-1}$ and $s_1, s_2$. The vertices in the 
pre-images $f_1^{-1}(b) = f_2^{-1}(b)$ and $f_1^{-1}(w) = f_2^{-1}(w)$ are called boundary vertices. 
Removing the boundary vertices from $G$ leaves a union of components. Consider 
such a component $K$. The subgraph of $G$ induced by $K$ is called a region (similar 
to [1]). For each region, either its vertices all map to s-vertices or all to t-vertices. 

We now examine how to test if there is a switching of the boundary vertices 
of G so that each region maps to H. 

First suppose that $H$ is switching equivalent to $H_\ell$ with odd $\ell \ge 3$. Let $K$ 
be some region of $G$. If the lists of vertices of $K$ contain only s-vertices or only 
t-vertices, then there is no choice and we will use mappings $f_2$ or $f_1$ respectively. 
Now suppose that the lists of vertices of $K$ contain both s-vertices and t-vertices. 
We claim we can use $f_1$ to map $K$ to $H$. If there is any list homomorphism 
$G \to H$ under which $K$ maps to s-vertices, then there is a switching of $G$ such 
that $K$, together with its boundary vertices, induces a subgraph having only 
blue edges. (Recall $b, s_1, s_2, w$ is a blue path.) The mapping $f_1$ restricted to $K$ 
and its boundary vertices is a homomorphism of $G$ to $H$. As each edge in the 
segment containing the t-vertices is at least blue, $f_1$ is a homomorphism of the 
induced subgraph to $H$. Thus for any region that has both s-vertices and 
t-vertices in its lists after the consistency checks, we may assume it is mapped to $H$ 
under $f_1$. The remaining regions must map to s-vertices and we may assume 
they are mapped using $f_2$. Moreover, by our key observation above, the edges 
mapping to unicoloured edges under these mappings must map to unicoloured 
edges under any mapping. In particular, the discovery of a cycle consisting of 
unicoloured edges in $G$ whose sign does not agree with the sign of its image under 
our use of $f_1$ and $f_2$ certifies that $G$ is a no-instance of the problem. 

It now remains to determine the switching of boundary vertices to ensure 
the signs of all unicoloured edges are positive. We describe this in detail in the 
journal version. 


4 Conclusions 


It seems difficult to give a full combinatorial classification of the complexity of 
list homomorphism problems for general signed graphs. For irreflexive signed 
graphs, which are in a sense the core of the problem, there is a conjectured 
classification in [18]. We have obtained a full dichotomy classification in the 
special case of separable irreflexive signed graphs. The classification confirms 
the dichotomy conjecture of [18] for this case, and also confirms that the only 
polynomial cases enjoy a special min ordering and the only NP-complete cases 
have chains or invertible pairs, as also conjectured in [18]. 


Abstract. The general position problem for graphs stems from a puzzle 
of Dudeney and the general position problem from discrete geometry. The 
general position number of a graph $G$ is the size of the largest set of vertices 
$S$ such that no geodesic of $G$ contains more than two elements of $S$. 
The monophonic position number of a graph is defined similarly, but with 
‘induced path’ in place of ‘geodesic’. In this abstract we discuss the smallest 
possible order of a graph with given general and monophonic position 
numbers, determine the asymptotic order of the largest size of a graph 
with given order and position numbers and finally determine the possible 
diameters of a graph with given order and monophonic position number. 


Keywords: General position · Monophonic position · Turán 
problems · Size · Diameter · Induced path 


1 Introduction 


In this paper all graphs will be taken to be simple and undirected. The order of 
the graph $G$ will be denoted by $n$ and its size by $m$. The clique number $\omega(G)$ of 
a graph $G$ is the order of the largest clique in $G$. An independent union of cliques 
in a graph $G$ is an induced subgraph $H$ of $G$ such that every component of $H$ is a 
clique. The independent clique number $\alpha^{\omega}(G)$ is the order of a largest independent 
union of cliques in $G$. The distance $d(u, v)$ between two vertices $u$ and $v$ in a graph 
$G$ is the length of a shortest path in $G$ from $u$ to $v$, and a shortest path is called a 
geodesic. An induced or monophonic path is a path without any chords. The join 
of two graphs $G$ and $H$ is the graph $G \vee H$ obtained from the disjoint union of $G$ 
and $H$ by joining every vertex of $G$ to every vertex of $H$. 

The general position problem for graphs can be traced back to one of the 
many puzzles of Dudeney [7]. This problem was introduced in the context of 
graph theory independently in [4,13]. A set $S$ of vertices of a graph $G$ is in 
general position if no geodesic of $G$ contains more than two points of $S$; in this 
case $S$ is a general position set, or a gp-set. The general position problem asks for 
the largest possible size of a gp-set for a given graph $G$; this number is denoted 
by $gp(G)$. A characterisation of the structure of a gp-set is derived in [2]. Other 
recent papers on the general position problem include [8–10, 14]. 
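
For small graphs the gp-number can be computed by exhaustive search. The sketch below assumes a connected graph given as an adjacency dictionary and uses the observation that three vertices lie on a common geodesic exactly when one of them satisfies $d(u,v) + d(v,w) = d(u,w)$; the function names are illustrative, and the search is exponential in the order of the graph.

```python
from collections import deque
from itertools import combinations, permutations


def distances(adj):
    """All-pairs distances of a connected graph via breadth-first search."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[s] = d
    return dist


def in_general_position(adj, S, dist=None):
    """True iff no geodesic contains three vertices of S."""
    dist = dist or distances(adj)
    return not any(dist[u][v] + dist[v][w] == dist[u][w]
                   for u, v, w in permutations(S, 3))


def gp_number(adj):
    """Largest size of a general position set (brute force over all subsets)."""
    dist = distances(adj)
    for k in range(len(adj), 0, -1):
        if any(in_general_position(adj, S, dist) for S in combinations(adj, k)):
            return k
    return 0
```

On a path with four vertices this returns 2, and on the complete graph with four vertices it returns 4, since every geodesic there is a single edge.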

In [15] the authors introduced the monophonic position number of a graph, or 
mp-number for short. A set S of vertices in a graph G is in monophonic position 
if there is no monophonic path in G that contains more than two elements of S. 
A set satisfying this condition is called a monophonic position set or simply an 
mp-set. The size of a largest mp-set in a graph G is the mp-number mp(G) of 
G. The mp- and gp-numbers of trees have a particularly simple form. 


Lemma 1 [4,15]. For any tree $T$ with $\ell(T)$ leaves we have $mp(T) = gp(T) = \ell(T)$. 
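
The mp-number can likewise be computed by brute force on small graphs, which gives a quick check of the mp half of Lemma 1 on a concrete tree. This is a sketch only: the graph is an adjacency dictionary, induced paths are enumerated by depth-first extension, and the names `induced_paths` and `mp_number` are assumptions.

```python
from itertools import combinations


def induced_paths(adj):
    """Yield every induced path of the graph as a tuple of vertices."""
    def extend(path):
        yield tuple(path)
        for v in adj[path[-1]]:
            # v may be adjacent only to the last vertex of the path (no chords)
            if v not in path and all(v not in adj[u] for u in path[:-1]):
                yield from extend(path + [v])

    for start in adj:
        yield from extend([start])


def mp_number(adj):
    """Largest size of a monophonic position set (exponential-time search)."""
    paths = [set(p) for p in induced_paths(adj)]
    for k in range(len(adj), 0, -1):
        for S in combinations(adj, k):
            if all(len(p & set(S)) <= 2 for p in paths):
                return k
    return 0


# A tree with three leaves (0, 3 and 4).
tree = {0: [1], 1: [0, 2], 2: [1, 3, 4], 3: [2], 4: [2]}
```

For this tree the search returns 3, the number of leaves, while for the 4-cycle it returns 2 because any three of its vertices lie on an induced path.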


In Sect. 2 we discuss the problem of finding the smallest possible order of a 
graph with given mp- and gp-numbers. In Sect. 3 we introduce the Turán-type 
problem of the largest possible size of a graph with given order and mp-number. 
We solve this problem asymptotically and present exact values for mp-number 
two, along with a classification of the extremal graphs. Finally in Sect. 4 we 
consider the problem of the possible diameters of graphs with given order and 
mp-number. 


2 The Smallest Graph with Given mp- and gp-Numbers 


In a previous paper the authors characterised the values of $a, b \in \mathbb{N}$ such that
there exists a graph with mp-number a and gp-number b.


Theorem 1 [15]. For all $a, b \in \mathbb{N}$ there exists a graph with mp-number a and
gp-number b if and only if $2 \le a \le b$ or $a = b = 1$.


This raises the question: for $2 \le a \le b$ what are the possible values of the order of
a graph with mp-number a and gp-number b? In particular, what is the smallest
such order? We give strong bounds on this order and solve the problem for a
certain range of a and b. For all $a, b \in \mathbb{N}$ such that $2 \le a \le b$ the order of the
smallest graph G with mp(G) = a and gp(G) = b will be denoted by $\mu(a,b)$.
Trivially for $a \ge 2$ we have $\mu(a,a) = a$. The following lower bound on $\mu(a,b)$ for
$a < b$ can be derived by considering the points lying on a longest induced path
that is not a geodesic.


Lemma 2. For $2 \le a < b$ we have $\mu(a,b) \ge b + 2$.


For $r \ge 3$ we define the pagoda graph Pag(r) as follows. The vertex set
of Pag(r) consists of three sets $A = \{a_1, a_2, \dots, a_r\}$, $B = \{b_1, \dots, b_r\}$,
$C = \{c_1, \dots, c_r\}$ of size r and an additional vertex x. For $1 \le i \le r$ we set $b_i$ to
be adjacent to $a_j$ and $c_j$ for $j \ne i$ and also add an edge from x to every vertex
of C. Pag(4) is illustrated in Fig. 1.
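The pagoda construction is easy to reproduce programmatically. The following sketch builds Pag(r) as a dictionary of adjacency sets; the vertex labels `("a", i)`, `("b", i)`, `("c", i)` and `"x"`, as well as the function names, are illustrative choices and not notation from the paper. It also computes distances inside the set formed by the layers A and C, since three vertices that are pairwise at distance two cannot lie on a common geodesic.

```python
from collections import deque
from itertools import combinations


def pagoda(r):
    """Adjacency sets of Pag(r) on vertices ('a', i), ('b', i), ('c', i) and 'x'."""
    adj = {(layer, i): set() for layer in "abc" for i in range(1, r + 1)}
    adj["x"] = set()

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(1, r + 1):
        for j in range(1, r + 1):
            if j != i:
                add_edge(("b", i), ("a", j))   # b_i ~ a_j for j != i
                add_edge(("b", i), ("c", j))   # b_i ~ c_j for j != i
        add_edge("x", ("c", i))                # x is joined to every vertex of C
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def num_edges(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def pairwise_distances(adj, S):
    """Set of distances realised between distinct vertices of S."""
    return {bfs_distances(adj, u)[v] for u, v in combinations(S, 2)}


pag4 = pagoda(4)
a_and_c = [v for v in pag4 if v != "x" and v[0] in "ac"]
```

For r = 4 the graph has 13 vertices and 28 edges, and every pair of distinct vertices of the A and C layers is at distance exactly two.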




Fig. 1. Pag(4) 


Lemma 3. For $r \ge 3$ we have mp(Pag(r)) = 2 and gp(Pag(r)) = 2r.


Proof. The order of Pag(r) is n = 3r + 1. We now show that for $r \ge 3$ we
have mp(Pag(r)) = 2 and gp(Pag(r)) = 2r. Trivially any two vertices constitute
an mp-set and $A \cup C$ is a gp-set of size 2r. It therefore suffices to show that
$\mathrm{mp}(\mathrm{Pag}(r)) \le 2$ and $\mathrm{gp}(\mathrm{Pag}(r)) \le 2r$.

For a contradiction, let M be an mp-set in Pag(r) with size $\ge 3$. For $1 \le i, j \le r$
and $i \ne j$ let $P_{i,j}$ be the monophonic path $a_i, b_j, c_i, x, c_j, b_i, a_j$. The existence
of this path shows that if $x \in M$, then M cannot contain two other vertices of
Pag(r), so we can assume that $x \notin M$. Suppose that M contains two vertices
from the same 'layer' A, B or C. For the sake of argument say $a_1, a_2 \in M$; the
other cases are similar. The path $P_{1,2}$ shows that $b_1, b_2, c_1, c_2 \notin M$. If another
element of A, say $a_3$, belonged to M, then we would have the monophonic
path $a_1, b_2, a_3, b_1, a_2$, a contradiction. For $3 \le j \le r$ the path $a_1, b_j, a_2$ is trivially
monophonic, so $M \cap B = \emptyset$. It follows that there must be a point $c_i \in M$ for some
$3 \le i \le r$. However $a_1, b_2, c_i, b_1, a_2$ is a monophonic path, another contradiction.
As M cannot contain $\ge 3$ points of Pag(r) we obtain the necessary inequality.

Now assume that K is any gp-set in Pag(r) with size $\ge 2r$. For $1 \le i, j, k \le r$,
$j \notin \{i, k\}$, let $Q_{i,j,k}$ be the geodesic $a_i, b_j, c_k, x$. Suppose that $x \in K$. If also
$K \cap C \ne \emptyset$, say $c_1 \in K$, then as $c_1, x, c_i$ is a geodesic for $2 \le i \le r$, it follows
that $K \cap \{c_2, \dots, c_r\} = \emptyset$. Also, letting $j \notin \{1, i\}$ in the path $Q_{i,j,1}$ shows that
$K \cap A = K \cap (B - \{b_1\}) = \emptyset$, so that we would have $|K| \le 3 < 2r$. Hence
$K \cap C = \emptyset$. Furthermore if some $b_j$ lies in K, then for $1 \le i, k \le r$ and $j \notin \{i, k\}$
the geodesic $Q_{i,j,k}$ contains x, $b_j$ and $a_i$, so that we would have $K \subseteq B \cup \{a_j, x\}$
and $|K| \le r + 2 < 2r$. Therefore $K \subseteq A \cup \{x\}$ and $|K| \le r + 1 < 2r$. Therefore
x is not contained in any gp-set of Pag(r) of size $\ge 2r$.

Suppose now that $K \cap B \ne \emptyset$, say $b_1 \in K$. For $2 \le i, k \le r$ the existence of
the geodesic $Q_{i,1,k}$ shows that K cannot intersect both $A - \{a_1\}$ and $C - \{c_1\}$.
Therefore if $|K| \ge 2r + 1$, K must either have the form $A \cup B \cup \{c_1\}$ or $\{a_1\} \cup B \cup C$;
however, $a_1, b_2, c_1$ is a geodesic that contains three points from both of these
sets. It follows that $|K| \le 2r$. Furthermore, if $|K \cap B| \ge 2$, say $b_2 \in K$, then
$K \cap ((A - \{a_1, a_2\}) \cup (C - \{c_1, c_2\})) = \emptyset$ and also K cannot contain both $a_i$ and
$c_i$ for i = 1, 2, so that |K| would be bounded above by r + 2, which is strictly
less than 2r, whereas if K contains a unique vertex of B, then again $|K| \le r + 2$;
therefore $A \cup C$ is the unique gp-set in Pag(r) with size 2r.

In a similar fashion it can be shown that the graphs $\mathrm{Pag}'(r) = \mathrm{Pag}(r) - \{a_1\}$
have order 3r, mp-number 2 and gp-number $2r - 1$ for $r \ge 3$.


We now define a second family of graphs. We need the following result on 
the position numbers of the join of graphs. 


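Lemma 4 [15]. The mp- and gp-numbers of the join $G \vee H$ of graphs G and
H satisfy

$$\mathrm{mp}(G \vee H) = \max\{\omega(G) + \omega(H), \mathrm{mp}(G), \mathrm{mp}(H)\}$$

and

$$\mathrm{gp}(G \vee H) = \max\{\omega(G) + \omega(H), \alpha^{\omega}(G), \alpha^{\omega}(H)\}.$$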


It follows from Lemmas 1 and 4 that if T is any tree with $\ell(T) \ge 3$ leaves,
then $\mathrm{mp}(K_1 \vee T) = \ell(T)$, whilst $\mathrm{gp}(K_1 \vee T) = \alpha^{\omega}(T)$. If T is a starlike tree
with r branches of length one and s branches of length two, then $K_1 \vee T$ has
order r + 2s + 2, mp-number r + s and gp-number r + 2s; the gp-number of this
graph matches the lower bound in Lemma 2. It follows that if $3 \le a < b$ and
$b/2 \le a$, then $\mu(a,b) = b + 2$; furthermore for this range there exists a graph with
mp-number a, gp-number b and order n if and only if $n \ge b + 2$.
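The range of parameters covered by the starlike-tree construction can be read off by solving for the numbers of branches. The short derivation below is a worked restatement of that step, using only the order, mp-number and gp-number quoted above for $K_1 \vee T$.

```latex
% Starlike tree T with r branches of length one and s branches of length two.
\begin{align*}
  \mathrm{mp}(K_1 \vee T) &= r + s = a, &
  \mathrm{gp}(K_1 \vee T) &= r + 2s = b \\
  \Longrightarrow \quad s &= b - a, &
  r &= 2a - b .
\end{align*}
% r >= 0 holds exactly when b/2 <= a, and s >= 1 exactly when a < b.
% The order is r + 2s + 2 = (2a - b) + 2(b - a) + 2 = b + 2.
% Example: (a, b) = (3, 5) gives r = 1, s = 2 and order 7 = b + 2.
```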

More generally, if we allow one branch of the starlike tree to have length 
longer than two, then our constructions yield the following upper bounds. 


Theorem 2.
– $\mu(2,3) = 5$ and for $b \ge 4$ we have $\mu(2,b) \le \lceil 3b/2 \rceil + 1$, with
equality for $4 \le b \le 8$.
– For $3 \le a < b$ and $b/2 \le a$ we have $\mu(a,b) = b + 2$.
– For $3 \le a < b/2$ we have $\mu(a,b) \le b - a + 2 + \lceil b/2 \rceil$.


We conjecture the bounds in Theorem 2 to be best possible. This has been
confirmed by computation for most pairs (a, b) between 2 and 11 [11].


3 The Largest Size of Graphs with Given Order 
and Position Numbers 


A Turán-type problem asks for the largest possible size of a graph with order n
that contains no subgraph isomorphic to a graph from a family $\mathcal{F}$ of forbidden
subgraphs. The first such result was proved by Mantel, who showed in [12] that
the largest possible size of a triangle-free graph with order n is $\lfloor n^2/4 \rfloor$, the unique
extremal graph being the complete bipartite graph $K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$. This result was
generalised by Turán as follows.

Theorem 3 [16]. The number of edges of a $K_{a+1}$-free graph H with order n is at most

$$\binom{n-r}{2} + (a-1)\binom{r+1}{2},$$

where $r = \lfloor n/a \rfloor$. Equality holds if and only if H is isomorphic to the Turán graph
$T_{n,a}$, which is the complete a-partite graph with every partite set of size $\lfloor n/a \rfloor$ or
$\lceil n/a \rceil$.
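The size of a Turán graph can be computed directly from its balanced part sizes. The sketch below, with illustrative function names, counts all vertex pairs and subtracts the pairs inside a part; it then compares the result with the closed form $\binom{n-r}{2} + (a-1)\binom{r+1}{2}$, $r = \lfloor n/a \rfloor$, which is a standard expression for this number and is used here as an assumption to be checked numerically.

```python
from math import comb


def turan_part_sizes(n, a):
    """Part sizes of the balanced complete a-partite graph on n vertices."""
    q, s = divmod(n, a)
    return [q + 1] * s + [q] * (a - s)


def turan_size(n, a):
    """Number of edges: all pairs of vertices minus pairs inside one part."""
    return comb(n, 2) - sum(comb(m, 2) for m in turan_part_sizes(n, a))


def turan_size_closed_form(n, a):
    """Closed form with r = floor(n / a)."""
    r = n // a
    return comb(n - r, 2) + (a - 1) * comb(r + 1, 2)


examples = {(n, a): turan_size(n, a) for n, a in [(5, 2), (7, 3), (10, 3)]}
```

For a = 2 the count reduces to Mantel's value $\lfloor n^2/4 \rfloor$; for instance the pair (5, 2) gives the 6 edges of $K_{3,2}$.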


We will denote the size of the Turán graph $T_{n,a}$ by $t_{n,a}$. We now discuss
some Turán-type problems for the general and monophonic position numbers.
For $a \ge 2$ and $n \ge a$ we define mex(n; a) (respectively gex(n; a)) to be the largest
possible size of a graph with order n and mp-number (resp. gp-number) a. Both
of these numbers are defined for $n \ge a$ by Theorem 1. The only graphs with gp-
number two are the cycle $C_4$ and paths, so for $n \ge 5$ we have gex(n; 2) = $n - 1$. We
can derive a quadratic upper bound for mex(n; a) by an elementary application
of Turán's Theorem. To reduce the bound slightly we will need a lemma on the
mp-numbers of complete multipartite graphs that extends the result on complete
bipartite graphs from [15].


Lemma 5. For integers $r_1 \ge r_2 \ge \dots \ge r_t$ the mp-number and gp-number of
the complete multipartite graph $K_{r_1, r_2, \dots, r_t}$ are given by

$$\mathrm{gp}(K_{r_1, r_2, \dots, r_t}) = \mathrm{mp}(K_{r_1, r_2, \dots, r_t}) = \max\{r_1, t\}.$$


Proof. Let the partite sets of $K_{r_1, r_2, \dots, r_t}$ be $W_1, W_2, \dots, W_t$ and let M be a
maximum mp-set of $K_{r_1, r_2, \dots, r_t}$. Suppose that M contains two vertices $u_1, u_2$ in
the same partite set W. Then M cannot contain any vertex v in any other partite
set, for $u_1, v, u_2$ is a monophonic path. Hence in this case $|M| \le |W| \le r_1$. If M
contains at most one vertex from every partite set then $|M| \le t$.

For the converse, observe that $K_{r_1, r_2, \dots, r_t}$ contains a clique of size t, so that
$|M| \ge t$. Each partite set is also an mp-set, so that $|M| \ge r_1$. The proof for
gp-sets is identical.


Lemma 6. For $a \le n \le a^2$

$$\mathrm{mex}(n; a) = t_{n,a},$$

but for $n \ge a^2 + 1$ we have the strict inequalities $\mathrm{mex}(n; a) < t_{n,a}$ and
$\mathrm{gex}(n; a) < t_{n,a}$.

Proof. Any clique is in monophonic position. Thus if mp(G) = a, then G is
$K_{a+1}$-free and the conclusion follows from Turán's Theorem and Lemma 5.


We now show that the upper bound given in Lemma 6 is asymptotically 
tight. 


Theorem 4. For $a \ge 2$ and $n \ge a^2 + 1$ we have

$$t_{n,a} - \left\lfloor \frac{n}{a} \right\rfloor \binom{a}{2} \le \mathrm{mex}(n; a) \le t_{n,a} - 1.$$

Proof. Take the Turán graph $T_{n,a}$ and label the partite sets $T_1, T_2, \dots, T_a$, where
$|T_i| = \lceil n/a \rceil$ for $1 \le i \le s$ and $|T_i| = \lfloor n/a \rfloor$ for $s + 1 \le i \le a$, where $s = n - \lfloor n/a \rfloor a$.
For $1 \le i \le a$ we denote the vertices of $T_i$ by $u_{ij}$, where $1 \le j \le \lceil n/a \rceil$ if $i \le s$
and $1 \le j \le \lfloor n/a \rfloor$ if $s + 1 \le i \le a$.

For each j in the range $1 \le j \le \lfloor n/a \rfloor$ delete the edges of the clique of order a
on the vertices $u_{ij}$, $1 \le i \le a$, from $T_{n,a}$. This yields a new graph $T^{*}_{n,a}$ with size
$t_{n,a} - \lfloor n/a \rfloor \binom{a}{2}$.

Considering the vertices $u_{ii}$ for $1 \le i \le a$, we see that $T^{*}_{n,a}$ contains a clique
of size a, so certainly $\mathrm{mp}(T^{*}_{n,a}) \ge a$. For the converse, let M be a largest mp-set
of $T^{*}_{n,a}$. Suppose that M contains two vertices $u_{ij}$ and $u_{ik}$ from the same partite
set $T_i$. Then M cannot contain a vertex $u_{i'j'}$ from a different partite set $T_{i'}$, where
$j' \notin \{j, k\}$, as $u_{ij}, u_{i'j'}, u_{ik}$ is a monophonic path. We have mex(5; 2) = 5, so the
lower bound holds for n = 5 and a = 2; otherwise for $n \ge a^2 + 1$ we have $\lfloor n/a \rfloor \ge 3$.
Hence let $1 \le l \le \lfloor n/a \rfloor$ and $l \notin \{j, k\}$. Then for any $i' \in \{1, 2, \dots, a\} - \{i\}$ the
path $P = u_{i'j}, u_{ik}, u_{i'l}, u_{ij}, u_{i'k}$ is monophonic, so $M \subseteq V(T_i)$. However the path
P shows that no three points of the partite set $T_i$ can all lie in M, so that
$|M| \le 2$. Therefore we can assume that any optimal mp-set has at most one
point in each partite set, so that $\mathrm{mp}(T^{*}_{n,a}) \le a$, completing the proof.
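The graph used for the lower bound can be generated explicitly. In the sketch below a vertex is a pair (i, j) meaning the j-th vertex of the i-th partite set, indexed from zero; this labelling and the function names are assumptions made for illustration. Two vertices are joined when they lie in different parts, except when they share an index j below $\lfloor n/a \rfloor$, which removes one clique of order a for each such j.

```python
from itertools import combinations
from math import comb


def turan_size(n, a):
    """Size of the balanced complete a-partite graph on n vertices."""
    q, s = divmod(n, a)
    return comb(n, 2) - s * comb(q + 1, 2) - (a - s) * comb(q, 2)


def turan_minus_cliques(n, a):
    """Vertices and edges of the Turan graph with floor(n/a) disjoint a-cliques removed."""
    q, s = divmod(n, a)
    # The first s parts have q + 1 vertices, the remaining parts have q.
    vertices = [(i, j) for i in range(a) for j in range(q + 1 if i < s else q)]
    edges = set()
    for u, v in combinations(vertices, 2):
        different_parts = u[0] != v[0]
        on_deleted_clique = u[1] == v[1] and u[1] < q
        if different_parts and not on_deleted_clique:
            edges.add((u, v))
    return vertices, edges


vertices, edges = turan_minus_cliques(10, 3)
diagonal = [(i, i) for i in range(3)]  # vertices with equal part and position index
```

For n = 10 and a = 3 this gives 33 - 9 = 24 edges, and the three "diagonal" vertices are pairwise adjacent, which is the clique of size a used in the proof.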


Theorem 4 shows that the asymptotic order of mex(n; a) is $\frac{1}{2}\left(1 - \frac{1}{a}\right)n^2 + O(n)$.
For a = 2 we were able to push our result further to give an exact formula
for mex(n; 2) and classify the extremal graphs. We omit the lengthy proof,
which makes use of Turán stability results from [3] and the classification of
non-bipartite triangle-free graphs with largest size from [1].


Theorem 5. For $n \ge 6$ we have $\mathrm{mex}(n; 2) = \lceil (n-1)^2/4 \rceil$, with the unique extremal
graph given by $T^{*}_{n,2}$ for odd n and $T^{*}_{n,2}$ with one edge added between the partite
sets for even n.


In marked contrast to the quadratic size of extremal graphs with given mono-
phonic position number, the function gex(n; a) is O(n). This can
be shown by a simple upper bound on the maximum degree of such a graph that
comes from Ramsey theory. A graph with order n = 10, general position number 3
and largest size is shown in Fig. 2. The Ramsey number R(s, t) is the smallest value
of n such that any graph with order n contains either a clique of size s or an
independent set of size t; taking the complements of the extremal graphs we trivially
have the symmetry R(s, t) = R(t, s).


Theorem 6. For $a \ge 3$ the function gex(n; a) is bounded above in terms of the
Ramsey number R(a, a+1) by $\mathrm{gex}(n; a) \le \frac{R(a,a+1) - 1}{2}\, n$.

Proof. Let G be a graph with order n, gp-number a, size gex(n; a) and maximum
degree $\Delta$. Suppose that there exists a vertex x with degree $d(x) \ge R(a, a+1)$
and let X be the subgraph induced by N(x). Then X contains either a clique of
order a, which together with x would give a clique of size a + 1, or else X has an
independent set of size a + 1; either of these sets constitutes a general position set
with more than a vertices. Thus $\Delta \le R(a, a+1) - 1$ and $\mathrm{gex}(n; a) \le \frac{R(a,a+1) - 1}{2}\, n$.
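The last step of this argument is the handshake lemma applied to the degree bound. Written out, with the smallest case as an example (the value R(3,4) = 9 is the classical Ramsey number, not a result of this paper):

```latex
\begin{align*}
  2\,|E(G)| = \sum_{v \in V(G)} d(v) \le n\,\Delta \le n\,\bigl(R(a,a+1) - 1\bigr)
  \quad\Longrightarrow\quad
  \mathrm{gex}(n;a) \le \frac{R(a,a+1) - 1}{2}\, n .
\end{align*}
% Example: R(3,4) = 9, so every graph with gp-number 3 has maximum degree
% at most 8 and hence gex(n;3) <= 4n.
```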

Fig. 2. A graph with order 10, gp-number 3 and largest size


Another interesting question is to find the smallest possible size of a graph with
order n, mp-number a and gp-number b; we denote this number by $\mathrm{ex}^{-}(n; a; b)$. It
follows from the results of Sect. 2 that this number exists for $a < b$ and sufficiently
large n. Again we conjecture the following constructions to be extremal.


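Theorem 7. If a and b have the same parity, then for $n \ge \frac{5b - 3a}{2} + 4$ we have
$\mathrm{ex}^{-}(n; a; b) \le n + \frac{b - a}{2} + 1$. If a and b have opposite parities, then for
$n \ge \frac{5b - 3a + 11}{2}$ we have $\mathrm{ex}^{-}(n; a; b) \le n + \frac{b - a + 3}{2}$.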


Proof. For $r \ge 2$ and $t \ge 0$ we define a graph S(r, t) as follows. Take a cycle
$C_{5r+1}$ of length 5r + 1 and identify its vertex set with $\mathbb{Z}_{5r+1}$ in the natural way.
Join the vertex 0 to the vertices 3 + 5s for all $s \in \mathbb{N}$ in the range $0 \le s \le r - 1$.
Finally append a set $W = \{w_1, \dots, w_t\}$ of t pendant vertices to the vertex 0. An
example is shown in Fig. 3. We claim that S(r, t) has mp(S(r, t)) = t + 2 and
gp(S(r, t)) = 2r + t.

The set $W \cup \{1, -1\}$ is obviously in monophonic position, so $\mathrm{mp}(S(r,t)) \ge t + 2$.
Let M be any optimal mp-set of S(r, t). By a result of [15] on triangle-free
graphs any set in monophonic position in S(r, t) is an independent set. The path
$1, 2, 3, \dots, 5r - 1, 5r$ in $C_{5r+1}$ is monophonic in S(r, t) and hence contains at
most two points of M, so if the mp-number of S(r, t) is any greater than t + 2,
then M contains three vertices of $C_{5r+1}$, one of which is 0, so that $M \cap W = \emptyset$
and t = 0. A simple argument shows that the mp-number of S(r, 0) is two. Thus
mp(S(r, t)) = t + 2.

Consider now the set $\{2 + 5s, 4 + 5s : 0 \le s \le r - 1\} \cup W$. The vertices
of this set are at distance at most four from each other and it is easily verified
that none of the geodesics between them pass through other vertices of the set.
Thus $\mathrm{gp}(S(r,t)) \ge 2r + t$. Let K be a gp-set of S(r, t) that contains $\ge 2r + t + 1$
vertices. For $0 \le s \le r - 1$ the set $S[s] = \{1 + 5s, 2 + 5s, 3 + 5s, 4 + 5s, 5 + 5s\}$
on $C_{5r+1}$ contains at most two vertices of K. It follows that K must contain the
vertex 0, two vertices in each of the aforementioned sets and every vertex of W.
As the vertices of W have shortest paths through 0 to the vertices of K in S[0], we must
have t = 0. For $r \ge 3$, if $0 \le s < s' \le r - 1$ and $s + 2 \le s'$, then vertices in
S[s] have shortest paths to the vertices in S[s'] passing through 0, so $0 \notin K$ and
gp(S(r, t)) = 2r + t. Also gp(S(2, 0)) = 4. If a and b have the same parity and
$b > a$ the graph $S\left(\frac{b-a}{2} + 1, a - 2\right)$ therefore has the required parameters. It is
shown in [15] that adding a pendant vertex to an extreme vertex (i.e. a vertex
with neighbourhood that induces a clique) preserves the mp-number. As this
also holds for the gp-number if $a \ge 3$ we can add a path to a vertex of W to give
a graph with any larger order $n' > n$ and the same mp- and gp-numbers. If a = 2
then lengthening one of the sections of length five on $C_{5r+1}$ accomplishes the
same aim. If a and b have opposite parities, then shortening one of the sections
of length five on $C_{5r+1}$ yields a graph with order n = 5r + t, size m = n + r,
mp-number a = t + 2 and gp-number b = 2r + t − 1; solving for r and t and
substituting yields the desired bounds.
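The construction S(r, t) and the bookkeeping behind the same-parity bound can be reproduced as follows. This is an illustrative sketch: the pendant vertices are labelled `("w", i)`, the function names are hypothetical, and only the order, the size and the general position property of the set described in the proof are examined, not the full values of the mp- and gp-numbers.

```python
from collections import deque
from itertools import combinations


def build_S(r, t):
    """Adjacency sets of S(r, t): cycle on Z_{5r+1}, chords at 0, t pendants at 0."""
    m = 5 * r + 1
    adj = {v: set() for v in range(m)}
    adj.update({("w", i): set() for i in range(1, t + 1)})

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for v in range(m):
        add_edge(v, (v + 1) % m)        # the cycle C_{5r+1}
    for s in range(r):
        add_edge(0, 3 + 5 * s)          # chords 0 ~ 3 + 5s for 0 <= s <= r - 1
    for i in range(1, t + 1):
        add_edge(0, ("w", i))           # pendant vertices w_1, ..., w_t
    return adj


def num_edges(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def in_general_position(adj, S):
    """True if no three vertices of S lie on a common geodesic."""
    d = {u: bfs_distances(adj, u) for u in S}
    return not any(
        d[u][v] + d[v][w] == d[u][w]
        or d[u][v] + d[u][w] == d[v][w]
        or d[u][w] + d[v][w] == d[u][v]
        for u, v, w in combinations(S, 3)
    )


def same_parity_parameters(a, b):
    """(r, t) with t + 2 = a and 2r + t = b, for a and b of equal parity, b > a."""
    return (b - a) // 2 + 1, a - 2


r, t = same_parity_parameters(5, 9)      # gives S(3, 3)
graph = build_S(r, t)
small = build_S(2, 1)
candidate = [2, 4, 7, 9, ("w", 1)]       # the set {2 + 5s, 4 + 5s} together with W
```

With a = 5 and b = 9 the graph is S(3, 3), with 19 vertices and 22 edges, in agreement with the order (5b − 3a)/2 + 4 and the size n + (b − a)/2 + 1. In S(2, 1) the five vertices of the candidate set are in general position.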


Fig. 3. S(3,3) (left) with an optimal gp-set in black (right)


4 The Diameters of Graphs with Given Order 
and mp-Number 


The possible diameters of graphs with given geodetic number or hull number
are classified in [5,6]. This raises the following question: what are the possible
diameters of a graph with given order n and monophonic position number k? In
particular, what are the largest and smallest possible diameters of such a graph? 
We now solve this problem. We split our analysis into two parts, beginning with 
mp-numbers k > 3. 


Theorem 8. For any integers k and n with $3 \le k \le n - 1$, there exists a
connected graph G with order n, monophonic position number k and diameter
D if and only if $2 \le D \le n - k + 1$.


Proof. It was shown in [15] that the monophonic position number of a graph
with order n and longest monophonic path with length L is bounded above by
$n - L + 1$; rearranging, it follows that $L \le n - k + 1$. As any geodesic in G is
induced, it follows that $n - k + 1$ is the largest possible diameter of a graph G
with order n and mp(G) = k. Therefore it remains only to show existence of
the required graphs for the remaining values of the parameters. For $k = n - 1$
and D = 2 this follows easily by considering the star graph $K_{1,n-1}$, so we can
assume that $k \le n - 2$.

For $n \ge 2$, Lemma 1 shows that any caterpillar graph formed by adding
$k - 2$ leaves to the internal vertices of a path of length $n - k + 1$ has order n,
mp-number k and diameter $D = n - k + 1$. For $k \ge 3$ we can construct a graph
$F(n, k, n - k)$ with order n, mp-number k and diameter $D = n - k$ as follows.
Take a path P of length $n - k$; let $V(P) = \{u_0, u_1, \dots, u_{n-k}\}$, where $u_i \sim u_{i+1}$
for $0 \le i \le n - k - 1$. Introduce a set Q of $k - 1$ new vertices $v_1, v_2, \dots, v_{k-1}$
and join each of them to $u_0$ and $u_1$. A straightforward argument shows that this
graph has the required parameters.


Fig. 4. F(16, 6,7) 


Finally for $2 \le D \le n - k - 1$ we define the graph F(n, k, D) as follows. Take
a path P of length $D - 2$ with vertices $\{x_0, x_1, \dots, x_{D-2}\}$, where $x_i \sim x_{i+1}$ for
$0 \le i \le D - 3$. Let $C_s$ be a cycle with length $s = n - D - k + 3$ and vertex
set $\{u_0, u_1, \dots, u_{s-1}\}$, where $u_i \sim u_{i+1}$ for $0 \le i \le s - 1$ and addition is carried
out modulo s. Join $x_0$ to every vertex of $C_s$, so that $C_s \cup \{x_0\}$ induces a wheel.
Finally append a set $Q = \{v_1, v_2, \dots, v_{k-2}\}$ of $k - 2$ pendant vertices to $x_{D-2}$.
An example of this construction is given in Fig. 4. This graph has order n and
diameter D. It is simple to verify that the set $Q \cup \{u_0, u_1\}$ is an mp-set and that
this is largest possible, so that mp(F(n, k, D)) = k.
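The order and diameter claimed for F(n, k, D) are straightforward to confirm by building the graph. The sketch below does this for the wheel-based construction only; the vertex labels `("x", i)`, `("u", i)`, `("v", i)` and the function names are illustrative, and the mp-number is not computed.

```python
from collections import deque


def build_F(n, k, D):
    """Adjacency sets of F(n, k, D): a wheel, a path of length D - 2, k - 2 pendants."""
    s = n - D - k + 3
    assert s >= 3, "the cycle needs at least three vertices"
    adj = {("x", 0): set()}              # x_0 exists even when the path is trivial

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for i in range(D - 2):
        add_edge(("x", i), ("x", i + 1))          # path x_0, ..., x_{D-2}
    for i in range(s):
        add_edge(("u", i), ("u", (i + 1) % s))    # cycle C_s
        add_edge(("x", 0), ("u", i))              # x_0 is the hub of the wheel
    for i in range(k - 2):
        add_edge(("x", D - 2), ("v", i))          # pendant vertices at x_{D-2}
    return adj


def eccentricity(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, v) for v in adj)


fig4 = build_F(16, 6, 7)
```

For the parameters of Fig. 4 the graph has 16 vertices, and a pendant vertex and a cycle vertex are at distance 1 + 5 + 1 = 7, which is the diameter.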


It is more challenging to determine the possible diameters of graphs with mp- 
number two. 


Theorem 9. There exists a graph with order n, monophonic position number
k = 2 and diameter $D \ge 3$ if and only if $D = n - 1$ or $3 \le D \le \lfloor n/2 \rfloor$.


Proof. For $n \ge 2$ the path with length $n - 1$ is a graph with order n, mp-number
k = 2 and diameter $n - 1$. Theorem 4 shows that for n > 3 the complete bipartite
graph $K_{\lceil n/2 \rceil, \lfloor n/2 \rfloor}$ minus a matching of size $\lfloor n/2 \rfloor$ has mp-number k = 2; this graph
has diameter D = 3. The existence of graphs with other values of the diameter
in the claimed range follows by altering the number of spokes in the 'half-wheel
graphs' defined in [15].

We now show that there is no graph with order n, mp-number k = 2 and
diameter D, where $\lfloor n/2 \rfloor < D < n - 1$. Suppose that G is such a graph and let
u and v be vertices at distance D. Assume that G is 2-connected. Then u and
v are joined by internally disjoint paths of length $\ge D$, so that $n \ge 2D > n$,
which is impossible. Hence G contains a cut-vertex w. Suppose that $d(w) \ge 3$.
Then choose a set M of three neighbours of w such that the vertices of M are
not all contained in the same component of $G - w$; it is easily seen that M is
an mp-set. Hence we must have d(w) = 2. Each neighbour of w is either a leaf
of G or a cut-vertex, so repeating this reasoning shows that G is a path, which
contradicts $D < n - 1$.


We now introduce a special graph operation to clarify when there exists a graph 
with diameter two and mp-number two. 


Lemma 7. If there exists a graph with order $n \ge 4$, monophonic position num-
ber k = 2 and diameter D = 2, then there exist graphs with monophonic position
number k = 2, diameter D = 2 and orders 3n, 3n + 1 and 3n + 2.


Proof. Let H be a graph with order $n \ge 4$, monophonic position number k = 2
and diameter D = 2. Label the vertices of H as $h_1, h_2, \dots, h_n$. We will construct
new graphs with orders 3n, 3n + 1 and 3n + 2 with mp-number k = 2 and
diameter D = 2 from H as follows.

First we define the graph G(H) with order 3n + 2. Let $X = \{x_1, x_2, \dots, x_n\}$
and $Y = \{y_1, y_2, \dots, y_n\}$ be two new sets of vertices disjoint from V(H). On
$X \cup Y$ draw a complete bipartite graph with partite sets X and Y and then
delete the perfect matching $x_i y_i$, $1 \le i \le n$. For $1 \le i \le n$ join both $x_i$ and $y_i$
to the vertex $h_i$ by an edge. Finally add two new vertices $z_1$ and $z_2$, join $z_1$ to
each vertex of X by an edge, join $z_2$ to each vertex of Y and lastly add the edge
$z_1 z_2$ between the two new vertices. An example of this construction for $H = C_4$
is displayed in Fig. 5. It is easily seen that G has diameter D = 2. To show that
the mp-number of G is k = 2 it is sufficient to show that for any set M of three
vertices of G there is an induced path containing each vertex of M. It is evident
that for any vertex $v \in V(G) - \{z_1, z_2\}$ there is an induced path in G containing
$z_1$, $z_2$ and v, so we can assume that $|M \cap \{z_1, z_2\}| \le 1$.

The map fixing every element of H, interchanging $x_i$ and $y_i$ for $1 \le i \le n$
and swapping $z_1$ and $z_2$ is an automorphism of G, which reduces the number
of cases that we need to check. Suppose that M is a set of three vertices of
G(H) containing one of $z_1, z_2$; say $z_1 \in M$. Without loss of generality we have
the following nine possibilities for $M' = M - \{z_1\}$: i) $M' = \{x_1, x_2\}$, ii) $M' = \{x_1, h_1\}$,
iii) $M' = \{x_1, h_2\}$, iv) $M' = \{x_1, y_1\}$, v) $M' = \{x_1, y_2\}$, vi) $M' = \{h_1, h_2\}$,
vii) $M' = \{h_1, y_1\}$, viii) $M' = \{h_1, y_2\}$ and ix) $M' = \{y_1, y_2\}$.

Consider the following two cycles. For $1 \le i, j \le n$, where $i \ne j$, we define
C(i, j) to be the cycle $z_1, x_i, y_j, h_j, x_j, z_1$ and, if P is a shortest path in H from
$h_i$ to $h_j$, then D(i, j) is the cycle formed from the path P from $h_i$ to $h_j$, followed
by the path $h_j, x_j, z_1, x_i, h_i$. Both of these cycles are induced and so can contain
at most two points of M. By varying the parameters i and j we see that the first
seven configurations for M' above are not possible.

For viii) let P be a shortest path in H from $h_1$ to $h_2$. By assumption $n \ge 4$,
so as P has length at most two, there exists a vertex of H, say $h_3$, not appearing
in P. Then the path P, followed by the path $h_2, y_2, x_3, z_1$ contains all three
vertices of M. Finally for case ix) the induced path $y_1, x_2, z_1, x_1, y_2$ contains all
three vertices of $\{z_1, y_1, y_2\}$.

We can now suppose that $M \cap \{z_1, z_2\} = \emptyset$. Observe that the subgraph of G
induced by $X \cup Y$ is isomorphic to the graph $T^{*}_{2n,2}$ from the proof of Theorem
4, so we can also assume that $M \not\subseteq X \cup Y$. Furthermore, as mp(H) = 2, we
can take $M \not\subseteq V(H)$. For all $1 \le i, j \le n$ and $i \ne j$ there is an induced cycle
$x_i, h_i, y_i, x_j, h_j, y_j, x_i$, so M must contain vertices with at least three different
subscripts $i \in \{1, \dots, n\}$. Therefore without loss of generality we are left with
the following three cases: i) $M = \{x_1, x_2, h_3\}$, ii) $M = \{x_1, y_2, h_3\}$ and iii)
$M = \{x_1, h_2, h_3\}$.

For cases i) and ii), let P' be the shortest path in H from $h_3$ to $\{h_1, h_2\}$;
without loss of generality P' is a $h_2, h_3$-path that does not pass through $h_1$. Then
in case i) $x_1, z_1, x_2, h_2$ followed by P' contains all three points of $M = \{x_1, x_2, h_3\}$
and in case ii) the path $x_1, y_2, h_2$ followed by P' contains all three vertices of
$M = \{x_1, h_3, y_2\}$. For case iii), if P' is a shortest $h_2, h_3$-path in H, then $x_1, y_2, h_2$
followed by P' contains all three vertices of $M = \{x_1, h_2, h_3\}$ unless $d(h_2, h_3) = 2$
and P' is the path $h_2, h_1, h_3$, in which case the path $h_3, y_3, x_1, y_2, h_2$ suffices.

This analysis also shows that the graph $G'(H) = G(H) - \{z_2\}$ with order
3n + 1 and the graph $G''(H) = G(H) - \{z_1, z_2\}$ with order 3n also have mp-
number k = 2 and diameter D = 2. Hence for any $n \ge 4$ if there is a graph with
order n, mp-number k = 2 and diameter D = 2 there also exists such a graph
for orders 3n, 3n + 1 and 3n + 2.
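The tripling construction can be carried out mechanically for a concrete H. The sketch below builds G(H) for H = C4 with vertices numbered 1 to 4 and confirms the order and diameter of G(H) and of the two graphs obtained by deleting $z_2$, or both $z_1$ and $z_2$. The labels `("h", i)`, `("x", i)`, `("y", i)`, `"z1"`, `"z2"` and the function names are illustrative, and the mp-number is not recomputed here.

```python
from collections import deque


def build_G(n, h_edges):
    """Adjacency sets of G(H), where H has vertices 1..n and edge list h_edges."""
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for u, v in h_edges:
        add_edge(("h", u), ("h", v))
    for i in range(1, n + 1):
        add_edge(("x", i), ("h", i))
        add_edge(("y", i), ("h", i))
        add_edge("z1", ("x", i))
        add_edge("z2", ("y", i))
        for j in range(1, n + 1):
            if i != j:
                add_edge(("x", i), ("y", j))   # K_{n,n} minus the matching x_i y_i
    add_edge("z1", "z2")
    return adj


def delete_vertices(adj, removed):
    """Induced subgraph on the vertices not in removed."""
    return {v: nbrs - removed for v, nbrs in adj.items() if v not in removed}


def diameter(adj):
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        assert len(dist) == len(adj), "graph must be connected"
        best = max(best, max(dist.values()))
    return best


c4_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
G = build_G(4, c4_edges)
G1 = delete_vertices(G, {"z2"})
G2 = delete_vertices(G, {"z1", "z2"})
```

For H = C4 the three graphs have orders 14, 13 and 12 and each has diameter two.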


Fig. 5. The graph $G(C_4)$

Theorem 10. There is a graph with order n, monophonic position number k =
2 and diameter D = 2 if and only if $n \in \{3, 4, 5, 8\}$ or $n \ge 11$.


Proof. The statement of the theorem has been verified by computer search for
all $n \le 32$ [11]. Let $n \ge 33$ and assume that the result is true for all orders < n.
Write n = 3r + s, where s is the remainder on division of n by 3. Then $r \ge 11$
and by the induction hypothesis there exists a graph with order r, mp-number
k = 2 and diameter D = 2. Then by Lemma 7 there exists a graph with order
n, mp-number k = 2 and diameter D = 2. The theorem follows by induction.
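The induction can be mirrored by a small closure computation. The sketch below takes the orders up to 32 stated in the theorem as given (they come from the computer search cited in the proof and are not recomputed here) and then applies the tripling step of Lemma 7 to reach larger orders; the function name and the cut-off `limit` are illustrative.

```python
def admissible_orders(limit):
    """Orders up to limit reachable from the base cases via n = 3r + s, 0 <= s <= 2."""
    # Base cases: the orders n <= 32 listed in the theorem (taken as given).
    known = {3, 4, 5, 8} | set(range(11, 33))
    for n in range(33, limit + 1):
        r = n // 3
        # Lemma 7 turns a graph of order r >= 4 into graphs of orders 3r, 3r+1, 3r+2.
        if r >= 4 and r in known:
            known.add(n)
    return known


orders = admissible_orders(1000)
missing = [n for n in range(3, 1001) if n not in orders]
```

Every order from 33 up to the cut-off is reached, since n ≥ 33 forces r ≥ 11; the only orders from 3 upwards that are missing are 6, 7, 9 and 10.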


Acknowledgements. The authors are grateful to the two anonymous reviewers for 

Abstract. Comparability graphs and cover-incomparability graphs (C-I
graphs) are two interesting classes of graphs from posets. The comparability
graph of a poset $P = (V, \le)$ is the graph with vertex set V in which two
vertices u and v are adjacent if u and v are comparable in P. A
C-I graph is a graph from P with vertex set V whose edge set is the
union of the edge sets of the cover graph and the incomparability graph of
the poset. C-I graphs have interesting implications on both graphs and
posets. In this paper, the C-I graphs that are also comparability graphs
are studied. We identify the class of comparability C-I graphs that are
Ptolemaic graphs, cographs, chordal cographs, distance-hereditary and
bisplit graphs. We also determine the posets of these C-I graphs.


Keywords: Comparability graphs · Cover-incomparability graphs ·
Ptolemaic graph · Cographs · Distance-hereditary graphs · Bisplit
graphs


1 Introduction 


Comparability graphs form a well-studied class of graphs from posets having 
many applications and algorithmic interest. Several characterizations are avail- 
able for comparability graphs [3]. One of them is a characterization in terms 
of forbidden subgraphs by Gallai in his classic paper [9]. Comparability graphs 
are also known as transitively orientable graphs, partially orderable graphs, and
containment graphs of families of sets [3]. Comparability graphs can be rec-
ognized in polynomial time [14]. Another important graph from posets is the
cover graph, which is the abstract undirected graph behind the Hasse diagram
of the poset. It is well known that the recognition problem for cover graphs is
NP-complete (Nešetřil and Rödl [15], and Brightwell [7]).
Cover-incomparability graphs of posets, or C-I graphs for short, were intro-
duced in [4] as underlying graphs of the standard transit function on posets.
The C-I graphs are precisely the graphs whose edge set is the union of edge sets
of the cover graph and the incomparability graph (complement of a compara-
bility graph) of a poset. As for cover graphs, the recognition problem for
C-I graphs is NP-complete (Maxová et al. [13]), in contrast to compa-
rability graphs. Hence the problem of characterizing well-known graph families
whose C-I graphs have polynomial recognition complexity is interesting. Such 
C-I graphs studied include the family of split graphs, block graphs [5], cographs 
[6], Ptolemaic graphs [11], distance-hereditary graphs [11] and k-trees [12]. C-I 
graphs were recently characterized among the planar graphs and chordal graphs 
along with new characterizations of Ptolemaic graphs, respectively in [2] and [1]. 
It is also interesting to note that every C-I graph has a Ptolemaic C-I graph as 
a spanning subgraph [2]. 

It is trivial to note that the C-I graphs that are cover graphs are precisely
the paths, but the same problem for comparability graphs is nontrivial and
exciting. This paper identifies the C-I graphs that are comparability graphs
among the Ptolemaic graphs, cographs, distance-hereditary graphs and bisplit
graphs. Of these graphs, bisplit graphs and cographs are comparability graphs
in general. We observe that the C-I graphs that are Ptolemaic graphs, and
those that are distance-hereditary graphs, are comparability graphs. If $\mathcal{G}$ is the class of C-I
graphs that are also comparability graphs, then for a graph $G \in \mathcal{G}$ there exist
two different partial orders on the vertex set of G, one giving rise to the C-I graph
and the other to the comparability graph, with both graphs isomorphic to
G. We address this problem also for the families of graphs mentioned above
and determine the posets.

We organize the paper as follows. In the rest of this introductory section, 
we fix the terminology and notations and discuss some preliminary results on 
C-I graphs. In Sect. 2, we characterize the posets and graphs whose compara-
bility graphs are Ptolemaic C-I graphs. In Sect. 3, we characterize the posets
and graphs whose comparability graphs are cographs and distance-hereditary
graphs. Similarly we do the same for bisplit graphs in Sect. 4. Finally, in Sect. 5,
we study the composition of C-I graphs and observe that the composition of C-I 
graphs need not be a C-I graph in general and determine some cases when the 
composition of graphs is a C-I graph as well as a comparability graph. 

A partially ordered set or poset $P = (V, \le)$ consists of a nonempty set V and
a reflexive, anti-symmetric, transitive relation $\le$ on V; we call $u \in V$ an
element of P. If $u \le v$ or $v \le u$ in P, we say u and v are
comparable, otherwise incomparable. If $u \le v$ but $u \ne v$, then we write $u < v$.
If u and v are in V, then v covers u in P if $u < v$ and there is no w in V
with $u < w < v$, denoted by $u \lhd v$. We write $u \lhd\lhd v$ if $u < v$ but not $u \lhd v$.
By $u \parallel v$, we mean that u and v are incomparable elements of P. A poset P is
pictured as a Hasse diagram consisting of elements of P and the covering relation
between elements denoted as line segments in upward orientation. Let $V' \subseteq V$
and $Q = (V', \le')$ be a poset; Q is called a subposet of P if $u \le' v$ if and only if
$u \le v$, for any $u, v \in V'$. The subposet $Q = (V', \le)$ is a chain (antichain) in P
if every pair of elements from V' is comparable (incomparable) in P. The
maximum cardinality of a chain is called the height of P, denoted by h(P). An element
u in P is minimal (maximal) if there is no $x \in V$ such that $x < u$ ($x > u$) in
P. A poset P is dual to a poset Q if for any $x, y \in P$ the following holds: $x \le y$
in P if and only if $y \le x$ in Q.

A finite ranked poset (also known as graded poset [8]) is a poset $P = (V, \le)$
that is equipped with a rank function $\rho: V \to \mathbb{Z}$ satisfying:


– $\rho$ has value 0 on all minimal elements of P, and
– $\rho$ preserves covering relations: if $a \lhd b$ then $\rho(b) = \rho(a) + 1$.


A ranked poset P is said to be complete if for every i, every element of rank i
covers all the elements of rank $i - 1$. For a completely ranked poset $P = (V, \le)$
we say that element $v \in V$ is on height i if $\rho(v) = i - 1$. We refer to [8] for notions
of posets.

Let G = (V, E) be a connected graph, with vertex set and edge set denoted by
V(G) and E(G) respectively; the complement of G is denoted by $\overline{G}$. A graph H is
said to be a subgraph of G if $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$. H is an induced
subgraph of G if, for $u, v \in V(H)$, $uv \in E(G)$ implies $uv \in E(H)$. A graph G
is said to be H-free if G has no induced subgraph isomorphic to H. A complete
graph is a graph whose vertices are pairwise adjacent, denoted by $K_n$; a set
$S \subseteq V(G)$ is a clique if the subgraph of G induced by S is a complete graph, and
a maximal clique is a clique that is not contained in any other clique. A vertex
v is called a simplicial vertex if its neighborhood induces a complete subgraph. An
independent set in a graph is a set of pairwise nonadjacent vertices. The 3-fan
is the graph that consists of a path on 4 vertices and a vertex adjacent to all
vertices of the path. The distance between u and v in G is the length (i.e. the
number of edges) of a shortest path from u to v in G. The diameter of G,
diam(G), is defined as the maximum distance over all pairs of vertices in G.
A graph G is bipartite if V(G) is the union of two disjoint independent sets called
partite sets of G. A complete bipartite graph or biclique is a bipartite graph such
that two vertices are adjacent if and only if they are in different partite sets. If
graphs $G_1$ and $G_2$ have disjoint vertex sets $V_1$ and $V_2$ and edge sets $E_1$ and $E_2$
respectively, then their union $G = G_1 \cup G_2$ has $V = V_1 \cup V_2$ and $E = E_1 \cup E_2$,
and their join, denoted by $G_1 \vee G_2$, consists of $G_1 \cup G_2$ and all edges joining
$V_1$ with $V_2$.

A graph G is chordal if it contains no induced cycles of length more than 
3, and it is distance-hereditary if every induced path is also a shortest path in G. 
A graph G is Ptolemaic if it is distance-hereditary and chordal. Equivalently, 
G is Ptolemaic if and only if it is a 3-fan-free chordal graph. P_4-free graphs are 
called cographs. A graph G that is both chordal and a cograph is called a chordal 
cograph. A graph G is the comparability graph of a poset P, denoted by C_P, if two 
vertices are adjacent in C_P if and only if they are comparable in P. Finally, the 
cover-incomparability graph of a poset P = (V, ≤), denoted by G_P, is the graph 
G = (V, E), where uv ∈ E(G) if either u ◁ v or v ◁ u or u ∥ v in P. A graph is a 
C-I (comparability) graph if it is the C-I (comparability) graph of some poset P. 
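
The two constructions above can be illustrated with a minimal sketch. It is not part of the original paper: it assumes a finite poset given by its covering pairs (a, b), read as "b covers a", and the function names strict_order, comparability_graph and ci_graph are hypothetical.

```python
from itertools import combinations

def strict_order(covers):
    """Transitive closure of the covering pairs: all (a, b) with a < b."""
    less = set(covers)
    while True:
        extra = {(a, d) for (a, b) in less for (c, d) in less if b == c} - less
        if not extra:
            return less
        less |= extra

def comparability_graph(elements, covers):
    """Edges join exactly the comparable pairs."""
    less = strict_order(covers)
    return {frozenset((u, v)) for u, v in combinations(elements, 2)
            if (u, v) in less or (v, u) in less}

def ci_graph(elements, covers):
    """Edges join pairs that are in a covering relation or incomparable."""
    less, cov = strict_order(covers), set(covers)
    edges = set()
    for u, v in combinations(elements, 2):
        incomparable = (u, v) not in less and (v, u) not in less
        if incomparable or (u, v) in cov or (v, u) in cov:
            edges.add(frozenset((u, v)))
    return edges
```

For the chain a < b < c the C-I graph is the path a, b, c while the comparability graph is a triangle; for a 3-element antichain the roles are reversed (the C-I graph is a triangle and the comparability graph has no edges).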

Now we recall some basic properties of posets and their C-I graphs. 


Lemma 1. [4] Let P be a poset. Then 
(i) the C-I graph of P is connected; 


Fig. 1. 3-fan, net graph and the poset N 


(ii) points of P that are independent in the C-I graph of P lie on a common 
chain; 

(iii) an antichain of P corresponds to a complete subgraph in the C-I graph of P; 

(iv) the C-I graph of P contains no induced cycles of length greater than 4. 


In the following sections, we discuss some families of graphs, which are compa- 
rability graphs, as well as C-I graphs and describe the family of posets whose 
comparability graphs are precisely the graphs in the family. We begin with the 
class of Ptolemaic graphs. 


2 Ptolemaic Graph 


It is proved in [2] that a graph G is a Ptolemaic C-I graph if and only if G 
is the C-I graph of a completely ranked poset P. In this section, we construct 
the family of posets whose comparability graphs are Ptolemaic C-I graphs, and 
prove that every Ptolemaic C-I graph is a comparability graph. 

Define a family of posets ℱ as follows. Consider a sequence of disjoint chains 
denoted by L_1, L_3, ..., L_{2d+1}, 

where L_i consists of the elements {a_{i1}, a_{i2}, ..., a_{ik_i}, a_{(i+1)1}, ..., a_{(i+1)k_{i+1}}}, 
for i = 1, 3, ..., 2d+1, d ≥ 0. Make a covering relation between a_{ik_i} in L_i and 
a_{(i−1)1} in L_{i−2}; that is, a_{ik_i} ◁ a_{(i−1)1}, for i = 3, 5, ..., 2d+1 (see Fig. 2). 


Theorem 1. A comparability graph C_P of a poset P is a Ptolemaic C-I graph 
if and only if P ∈ ℱ. Moreover, every Ptolemaic C-I graph is a comparability 
graph. 


Proof. Let G be a comparability graph of some poset P. That is, G = C_P. 

Suppose that G is also a Ptolemaic C-I graph of a poset, say P′. Then it 
follows that P′ is a completely ranked poset. Let the set of elements of rank i be 
denoted by C_{i+1}. Now every element of C_{i+1} covers every element of C_i. That 
is, the sets C_i and C_i ∪ C_{i+1} form cliques covering all the edges in G, and 
V(G) = C_1 ∪ C_2 ∪ C_3 ∪ ⋯ ∪ C_h for some h > 0. Also, an element x is adjacent 
to y in G if and only if x, y ∈ C_i ∪ C_{i+1} for some i ∈ {1, 2, ..., h−1}. Let C_i = 
{a_{i1}, a_{i2}, ..., a_{ik_i}} for i = 1, 2, ..., h. Adjacent vertices in a comparability 
graph must lie on the same chain, and nonadjacent vertices lie on different chains 
in P. So the elements in C_i and C_{i+1} lie on a chain, and the elements in C_i and 
C_j lie on different chains, for i = 1, 2, ..., h−1 and j ∉ {i−1, i, i+1}. This implies 
that C_i ∪ C_{i+1} = L_i for i = 1, 3, ..., 2d+1, where 2d+1 is h or h−1 according 
as h is odd or even, respectively. That is, we have proved that P ∈ ℱ. 

Fig. 2. G_{P′}: the general form of the family of Ptolemaic C-I graphs; ℱ: the 
corresponding family of posets whose comparability graphs are Ptolemaic C-I graphs 


Conversely, suppose that P ∈ ℱ. That is, the elements of P form disjoint 
chains L_i with the covering relation defined between the elements a_{ik_i} in L_i and 
a_{(i−1)1} in L_{i−2}. Now partition the elements of the chains L_i as C_i = {a_{i1}, a_{i2}, ..., a_{ik_i}} 
for i = 1, 2, ..., h. In C_P, the elements in C_i and C_i ∪ C_{i+1} form cliques for 
i = 1, 2, ..., h. Also, no pair of elements in C_i and C_j, for |i−j| > 1, is adjacent 
in C_P. Let P′ be the poset defined such that every element of C_{i+1} covers every 
element of C_i, i = 1, ..., h−1. Now P′ is a completely ranked poset and hence G_{P′} 
is a Ptolemaic C-I graph. It is clear that G_{P′} = C_P. That is, we have proved that 
corresponding to every completely ranked poset P′ there exists a poset P ∈ ℱ, 
and conversely for every P ∈ ℱ there exists a completely ranked poset P′ such 
that C_P = G_{P′}, which completes the proof. 
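
The correspondence in Theorem 1 can be sanity-checked on small instances. The sketch below is an illustration under stated assumptions, not the authors' code: levels are given by their sizes k_1, ..., k_h (all assumed to be at least 1), the element a_{ij} is encoded as the tuple (i, j), and the covering relation between chains follows our reading of the definition at the start of this section (the last element of C_i is covered by the first element of C_{i−1} for odd i ≥ 3). All function names are hypothetical.

```python
from itertools import combinations

def closure(pairs):
    """Transitive closure of a set of (lower, upper) pairs."""
    less = set(pairs)
    while True:
        extra = {(a, d) for (a, b) in less for (c, d) in less if b == c} - less
        if not extra:
            return less
        less |= extra

def ci_edges(elements, covers):
    """C-I graph: covering pairs plus incomparable pairs."""
    less, cov = closure(covers), set(covers)
    return {frozenset((u, v)) for u, v in combinations(elements, 2)
            if (u, v) in cov or (v, u) in cov
            or ((u, v) not in less and (v, u) not in less)}

def comparability_edges(elements, covers):
    """Comparability graph: comparable pairs."""
    less = closure(covers)
    return {frozenset((u, v)) for u, v in combinations(elements, 2)
            if (u, v) in less or (v, u) in less}

def levels(sizes):
    """C_i = [(i, 1), ..., (i, k_i)] for i = 1..h."""
    return {i: [(i, j) for j in range(1, k + 1)]
            for i, k in enumerate(sizes, start=1)}

def completely_ranked_covers(sizes):
    """P': every element of C_{i+1} covers every element of C_i."""
    C = levels(sizes)
    return [(x, y) for i in range(1, len(sizes))
            for x in C[i] for y in C[i + 1]]

def family_covers(sizes):
    """Poset built from chains L_i = C_i followed by C_{i+1}, i odd."""
    C = levels(sizes)
    covers = []
    for i in range(1, len(sizes) + 1, 2):
        chain = C[i] + C.get(i + 1, [])
        covers += list(zip(chain, chain[1:]))
        if i >= 3:  # a_{i k_i} is covered by a_{(i-1) 1}
            covers.append((C[i][-1], C[i - 1][0]))
    return covers

def theorem1_holds(sizes):
    elements = [x for level in levels(sizes).values() for x in level]
    return (ci_edges(elements, completely_ranked_covers(sizes))
            == comparability_edges(elements, family_covers(sizes)))
```

Calling theorem1_holds([2, 1, 3, 2, 1]) compares the C-I graph of the completely ranked poset with the comparability graph of the chain-based poset on the same ground set.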


3 Cograph and Distance-Hereditary Graph 


In this section, we prove that C-I graphs among distance-hereditary graphs are 
comparability graphs. We first prove that the C-I graph, which is a distance- 
hereditary graph, is either a Ptolemaic graph or a cograph. We determine the 
posets whose comparability graphs are distance-hereditary C-I graphs. 

Distance-hereditary C-I graphs have been studied in [11]. We quote the fol- 
lowing result from [6] and [11]. 


Theorem 2. [6] Let G be a chordal cograph. Then G is a C-I graph if and only 
if G is a connected graph that contains at most two maximal cliques. 

From Theorem 2, a graph G is a chordal graph, a C-I graph and a cograph (called a 
chordal C-I cograph) if and only if there exist three pairwise disjoint sets C_1, C_2 
and C_3 such that V(G) = C_1 ∪ C_2 ∪ C_3, and x, y ∈ V(G) are adjacent in G if and only if 
x, y ∈ C_1 ∪ C_2 or x, y ∈ C_2 ∪ C_3. That is, C_1, C_2, C_3, C_1 ∪ C_2 and C_2 ∪ C_3 form 
cliques and the graph has no other edges. It is clear that a chordal C-I cograph is 
a Ptolemaic C-I graph, as it is the C-I graph of a completely ranked poset of height 3. 
From this observation and the posets ℱ from Sect. 2, we can describe the family 
of posets whose comparability graphs are chordal C-I cographs. Such a poset is 
depicted in Fig. 3(b). 




Fig. 3. (a) General form of a chordal C-I cograph, where C_1, C_2, C_3, C_1 ∪ C_2 and 
C_2 ∪ C_3 form cliques. (b) Corresponding poset whose comparability graph is a chordal 
C-I cograph. 


Now from the fact that a C-I cograph is the join of chordal C-I cographs 
(which is proved in [6]) and from the family of posets in Fig. 3, we can describe 
the family of posets whose comparability graphs are also C-I graphs as well as 
cographs. It is obtained by the ordinal sum of the poset or its dual as shown in 
Fig. 3(b). (The ordinal sum Z = P ⊕ R of two disjoint posets P and R is the 
poset with the underlying set as union of the underlying sets of P and R and 
the Hasse diagram of Z is obtained by placing the Hasse diagram of R above the 
Hasse diagram of P and joined by a covering relation from the maximal elements 
of P to all the minimal elements of R). In Fig. 4(a), one such family of posets is 
described. 
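
The ordinal sum described in the parenthetical remark can be sketched on Hasse diagrams as follows. This is an illustrative sketch only: posets are assumed to be given as a list of elements plus a list of covering pairs (a, b) meaning "b covers a", and ordinal_sum is a hypothetical helper name.

```python
def ordinal_sum(p_elems, p_covers, r_elems, r_covers):
    """Hasse diagram of P (+) R: R is placed above P, and every maximal
    element of P is covered by every minimal element of R."""
    # Maximal elements of P never occur as the lower end of a covering pair.
    p_max = [x for x in p_elems if all(a != x for a, _ in p_covers)]
    # Minimal elements of R never occur as the upper end of a covering pair.
    r_min = [y for y in r_elems if all(b != y for _, b in r_covers)]
    covers = (list(p_covers) + list(r_covers)
              + [(x, y) for x in p_max for y in r_min])
    return list(p_elems) + list(r_elems), covers
```

For example, the ordinal sum of the antichain {a, b} with the chain c < d adds the covering pairs (a, c) and (b, c) to the existing pair (c, d).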

It is known that, in general, a cograph is a comparability graph of a series- 
parallel partial order. (A poset P = (V,<) is series parallel if and only if the 
poset N (shown in Fig. 1) is not a subposet of P.) Now the C-I cographs are the 
join of chordal C-I cographs whose posets are a particular class of series-parallel 
partial orders. 


Theorem 3. A graph G is both a C-I graph and a cograph if and only if Ḡ is 
a vertex disjoint union of b complete bipartite subgraphs and i isolated vertices, 
with i ≥ b. 

Fig. 4. G: a C-I cograph, the join of chordal C-I cographs G_1, G_2, ..., G_n; (a): 
a poset whose comparability graph is G. 


Proof. If G is a complete graph, then the result follows trivially. Now let G be 
not complete. 

Suppose G is a C-I cograph. That is, it is the join of chordal C-I cographs. 
Let G = G_1 ∨ G_2 ∨ ... ∨ G_n, where G_1, G_2, ..., G_n are chordal C-I cographs. 
For each G_k, the vertex set consists of pairwise disjoint sets C_1^k, C_2^k and C_3^k such 
that V(G_k) = C_1^k ∪ C_2^k ∪ C_3^k and C_1^k, C_2^k, C_3^k, C_1^k ∪ C_2^k and C_2^k ∪ C_3^k form 
cliques covering all the edges of G_k. If some of the G_k are complete graphs, then 
we can take C_1^k and C_3^k as empty sets (see Fig. 3(a)). Now consider Ḡ. In 
Ḡ, the vertices of C_1^k ∪ C_3^k induce complete bipartite subgraphs for k = 1, 2, ..., n, 
and ⋃_{k=1}^{n} C_2^k induces isolated vertices. Let i = |⋃_{k=1}^{n} C_2^k|; then clearly 
i ≥ n ≥ b, where b is the number of complete bipartite subgraphs obtained. 

Conversely, let G be a graph such that Ḡ is a vertex disjoint union of b 
complete bipartite subgraphs and i isolated vertices, i ≥ b. Let H_1, H_2, ..., H_b 
be the complete bipartite subgraphs of Ḡ and, for each H_j, let the vertex set V(H_j) 
be C_1^j ∪ C_3^j. Since every H_j is a complete bipartite graph, every vertex in 
C_1^j is adjacent to every vertex in C_3^j in Ḡ, and the sets C_1^j and C_3^j are 
independent in Ḡ, for j = 1, 2, ..., b. Since there are i isolated vertices in Ḡ with 
i ≥ b, we can partition the i isolated vertices into b nonempty sets, labeled C_2^j for 
j = 1, 2, ..., b. Now we consider the graph G and construct induced subgraphs of G as 
follows. Let G_j be the induced subgraph of G with the vertex set V(G_j) = C_1^j ∪ C_2^j ∪ C_3^j. 
By the definition of C_i^j, i = 1, 2, 3, and their adjacency relations in G, it 
follows that C_1^j, C_2^j, C_3^j, C_1^j ∪ C_2^j and C_2^j ∪ C_3^j form cliques covering all the edges 
in G_j, for j = 1, 2, ..., b. That is, G_j is a chordal C-I cograph. Also, every vertex 
of G_j is adjacent to every vertex of G_k, for k = 1, 2, ..., b, j ≠ k, in G. Since 
V(G) = V(G_1) ∪ V(G_2) ∪ ... ∪ V(G_b), G is the join of chordal C-I cographs, that is, 
G is a C-I cograph. Hence the theorem. 


Theorem 4. [11] Let G be a distance-hereditary graph. G is a C-I graph if and 
only if one of the following two conditions holds: 


(1) diam(G) = 2 and Ḡ is a vertex disjoint union of b complete bipartite sub- 
graphs and i isolated vertices, with i ≥ b, or 

(2) diam(G) ≥ 3, G is chordal, and G does not contain a triple of independent 
simplicial vertices (that is, G is Ptolemaic). 


From Theorems 3 and 4 we have, 
Theorem 5. A graph G is a distance-hereditary C-I graph if and only if either 


(i) G is a Ptolemaic C-I graph, if diam(G) > 2, or 
(ii) G is a C-I cograph, if diam(G) = 2. 


In general, a distance-hereditary graph need not be a comparability graph. 
For example, it can be verified easily that the net graph (shown in Fig. 1) is a 
distance-hereditary graph, but it is not a comparability graph. From Theorem 5, 
it follows that a distance-hereditary C-I graph is either a C-I cograph or a Ptole- 
maic C-I graph. Hence distance-hereditary C-I graphs are comparability graphs. 
The posets whose comparability graphs are distance-hereditary C-I graphs 
are either of the form ℱ in Fig. 2 or ordinal sums of the posets, or their 
duals, in Fig. 3(b). In Fig. 4(a), one such family of posets is described. 


4 Bisplit Graphs 


In this section, we identify C-I graphs that are bisplit graphs, and prove that 
bisplit C-I graphs are comparability graphs. 

A graph G is a bisplit graph if its vertex set can be partitioned into three 
independent sets X,Y and Z such that Y U Z induces a complete bipartite 
subgraph (bi-clique) in G. That is, a graph is bisplit if and only if it can be 
partitioned into an independent set and a bi-clique. 

It is trivial to observe that path graphs are bipartite. Now suppose a C-I 
graph G_P of a poset P is a bipartite graph. That is, the vertex set V (|V| = n) can 
be partitioned into two independent sets, say {u_1, u_2, ..., u_r} and {v_1, v_2, ..., v_s}, 
where n = r + s. Since u_1, u_2, ..., u_{r−1} and u_r are independent, they lie on a chain 
alternately. Also v_1, v_2, ..., v_{s−1} and v_s lie on a chain alternately. Therefore, the 
poset P is a chain when |V| > 2. When |V| = 2, the poset is a chain of height 
2 or an antichain of size 2. Thus we have the following remark. 


Remark 1. A C-I graph G is bipartite if and only if it is a path. Hence a C-I 
graph G is a complete bipartite graph if and only if G is one of the path graphs P_1, 
P_2 or P_3. 

Fig. 5. Family of bisplit C-I graphs (𝒢_{C-I}) 


The family of graphs denoted by 𝒢_{C-I} is depicted in Fig. 5. 


Theorem 6. A graph G is a bisplit C-I graph if and only if G is from 𝒢_{C-I} (in 
Fig. 5). 


Proof. Let G be a bisplit C-I graph. So the vertices of G can be partitioned into 
three independent sets X, Y and Z such that X ∪ Y induces a bi-clique in G. 
Since G is a C-I graph, independent elements lie on a chain non-consecutively. 
X ∪ Y induces a bi-clique, so every element of X is adjacent to every element of 
Y. That is, one of the following will occur. 


(i) Every element in X is incomparable with every element of Y. 
(ii) The maximal element in X covers the minimal element of Y. 
(iii) The minimal element of X covers the maximal element of Y. 
(iv) Satisfies both conditions (ii) and (iii). 


It may be noted that the other elements in X and Y are incomparable. Let C_1 be the 
chain containing the elements in X and C_2 be the chain containing the elements 
in Y. It is further noted that the elements in X, Y cannot occur consecutively in 
C_1, respectively C_2. Since X, Y, Z form a partition of V, the remaining elements 
of C_1 and C_2 must be from Z. Since the elements in Z are also independent, the 
elements in Z should also lie on a chain. This will happen only if either |X| ≤ 1 
or |Y| ≤ 1. Without loss of generality assume |X| ≤ 1. The following cases can 
happen. 


Case 1: |X| = 0, |Y| = 0; then |Z| = 1. In this case, G is the single vertex graph. 
Case 2: |X| = 0, |Y| = n, n ≥ 1; then |Z| = n − 1, n or n + 1. 
In this case G is the path graph P_m, where m = 2n − 1, 2n or 2n + 1. 
Case 3: |X| = 1, |Y| = 1; then |Z| = 0, 1 or 2 
(the case |X| = 1, |Y| = 0 is similar to Case 2, when n = 1). 
Let X = {x} and Y = {y}; then 
(i) if |Z| = 0, then the graph G is K_2; 
(ii) if |Z| = 1, let Z = {z}; then the possible posets are isomorphic to the 
posets in Fig. 6 or their duals. 
It is clear that the C-I graph corresponding to the poset in Fig. 6(i) is 
P_3 and in all the other cases in Fig. 6, the C-I graph is isomorphic to K_3 
(a special case of Fig. 5(d)). 


Fig. 6. Posets whose C-I graphs have a bi-clique of size 2 and independent set of size 
1 (i.e., |X| = |Y| = |Z| = 1) 


(iii) if |Z| = 2, let Z = {z_1, z_2}; then the possible posets are isomorphic to the 
posets in Fig. 7 or their duals. 
The C-I graphs corresponding to the posets in Fig. 7(i), (ii) and (iii) are 
isomorphic to the 2-fan (Fig. 5(d) is the k-fan), the C-I graph corresponding 
to the poset in Fig. 7(iv) is a sub-case of Fig. 5(e), and that of the poset in 
Fig. 7(v) is the path graph P_4 (Fig. 5(a)). 


Fig. 7. Posets whose C-I graphs have a bi-clique of size 2 and independent set of size 
2 (i.e., |X| = |Y| = 1 and |Z| = 2). 


Case 4: |X| = 1, |Y| = 2; then |Z| = 0, 1, 2 or 3. 
(i) |Z| = 0; then the graph is just a P_3. 
(ii) |Z| = 1, let Z = {z}; then the possible posets are isomorphic to the 
posets in Fig. 8 or their duals, and they are precisely the posets isomorphic 
to those of Case 3(iii). 
(iii) |Z| = 2, let Z = {z_1, z_2}; then the possible posets are isomorphic to the 
posets in Fig. 9 or their duals. 

Fig. 8. Posets whose C-I graphs have a bi-clique of size 3 and independent set of size 
1 (i.e., |X| = 1, |Y| = 2 and |Z| = 1). 


The C-I graphs corresponding to the posets in Fig. 9 (i) and (ii) are shown 
in Fig. 5 (a) and (b), respectively. The C-I graph corresponding to 
the poset in Fig. 9 (iii) is depicted in Fig. 5 (c). The C-I graphs corre- 
sponding to the posets in Fig. 9 (iv) and (v) are isomorphic to the graph 
depicted in Fig. 5 (e). For all the other posets in Fig. 9, the C-I graphs 
are isomorphic to the graph shown in Fig. 5 (d). 


Fig. 9. Posets whose C-I graphs have a bi-clique of size 3 and independent set of size 
2 (i.e., |X| = 1, |Y| = 2 and |Z| = 2) 


Case 5: |X| = 1, |Y| = n, n ≥ 3; then |Z| = n − 1, n or n + 1. The possible posets 
are isomorphic to the posets in Fig. 10 or their duals. 
The C-I graphs corresponding to the posets in Fig. 10 (i), (ii) and (iii) 
are isomorphic and are shown in Fig. 5 (d). The C-I graphs corresponding 
to the posets in Fig. 10 (iv) and (v) are isomorphic and are depicted 
in Fig. 5 (e). The C-I graph corresponding to the poset in Fig. 10 (vi) is 
shown in Fig. 5 (f). 


In all the cases, we have shown that the C-I graphs G which are bisplit graphs 
belong to the family 𝒢_{C-I}, and hence the necessary part follows. 

It is easy to verify that the graphs from the family 𝒢_{C-I} are bisplit C-I 
graphs. Hence the theorem follows. 


The posets whose comparability C-I graphs are bisplit graphs are easily con- 
structed, and for each bisplit graph in Fig. 5, the corresponding posets are 
represented in Fig. 11. 

Fig. 10. Posets whose C-I graphs have a bi-clique of size ≥ 4 (i.e., |X| = 1, |Y| = n, 
n ≥ 3, and then |Z| = n − 1, n or n + 1). 

Fig. 11. The posets whose comparability graphs are bisplit C-I graphs (𝒢_{C-I}) 


5 Composition of C-I Graphs 


It may be noted that the composition of graphs plays a vital role in the theory 
of comparability graphs, as noted in Theorem 7 by Martin Golumbic in [10]. 
We first define the composition of graphs. Let G_0 be a graph with n ver- 
tices v_1, v_2, ..., v_n and let G_1, G_2, ..., G_n be n disjoint graphs. The composition 
graph G = G_0[G_1, G_2, ..., G_n] is formed as follows: for all 1 ≤ i, j ≤ n, replace 
vertex v_i in G_0 with the graph G_i and make each vertex of G_i adjacent to each 
vertex of G_j whenever v_i is adjacent to v_j in G_0. It may be noted that if G_0 is 
a complete graph, then the composition becomes the join operation of graphs. 
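
The composition operation just defined can be sketched directly. The code is illustrative and not from the paper: graphs are assumed to be given as a vertex list and an edge list, the dictionary `parts` (a hypothetical name) maps each vertex of G_0 to the graph substituted for it, and the substituted vertex sets are assumed pairwise disjoint.

```python
def composition(g0_vertices, g0_edges, parts):
    """Return (vertices, edges) of G_0[G_1, ..., G_n].

    parts[v] = (vertex list, edge list) of the graph replacing vertex v.
    """
    vertices = [u for v in g0_vertices for u in parts[v][0]]
    # Edges inside each substituted graph are kept.
    edges = {frozenset(e) for v in g0_vertices for e in parts[v][1]}
    # Adjacent vertices of G_0 give complete connections between their graphs.
    for vi, vj in g0_edges:
        for a in parts[vi][0]:
            for b in parts[vj][0]:
                edges.add(frozenset((a, b)))
    return vertices, edges
```

With G_0 = K_2 and both substituted graphs edgeless on two vertices, the result is the join of the two graphs, that is, the 4-cycle K_{2,2}.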


Theorem 7. [10] Let G = G_0[G_1, G_2, ..., G_n], where the G_i's are disjoint graphs. 
Then G is a comparability graph if and only if each G_i (0 ≤ i ≤ n) is a compa- 
rability graph. 


This section attempts to study the comparability graphs among C-I graphs 
using the composition operation. 


Theorem 8. If G = G_0[G_1, G_2, ..., G_n] is a C-I graph, then G_0 is a C-I graph. 


Proof. Let G be the composition graph which is also a C-I graph of a poset 
P. Let G = G_0[G_1, G_2, ..., G_n]. If u, v ∈ V(G), then uv ∈ E(G) if and only 
if either u, v ∈ V(G_i) with uv ∈ E(G_i), or u ∈ V(G_i) and v ∈ V(G_j), where the 
corresponding vertices v_i for G_i and v_j for G_j in G_0 are adjacent (i.e., v_i v_j ∈ 
E(G_0)). 

Let S = {u_1, u_2, u_3, ..., u_n} ⊆ V(G), where u_i ∈ V(G_i) for i = 1, 2, ..., n. 
Let G_S be the subgraph of G induced by S. Then G_S ≅ G_0. 

Now we need to prove that G_S is a C-I graph. For u, v ∈ V(G_S), depending 
upon whether uv ∈ E(G_S) or uv ∉ E(G_S), the following cases can occur. 


Case 1: If uv ∈ E(G_S), then clearly uv ∈ E(G). So either u ◁ v or v ◁ u or u ∥ v 
in P. Now the subposet P′ of P consisting of elements u, v ∈ S with 
u ◁ v or v ◁ u or u ∥ v is a poset whose C-I graph is isomorphic to G_S. 

Case 2: If uv ∉ E(G_S), then uv ∉ E(G); that is, either u ◁◁ v or v ◁◁ u in P 
(u and v are comparable, but neither covers the other). 
That is, u and v lie on some chain, say C, in P. If all the elements of 
C belong to S, then the poset P′ defined by the same relation in C is 
such that G_{P′} = G_S. If C contains an element w belonging to some G_i 
and not in S, then replace w by the unique element w′ in G_i which is in 
S, thus obtaining a poset P′ consisting of only elements in S using the 
same relation in C. It follows that G_{P′} = G_S. 


Hence the theorem. 


Theorem 9. Let G = G_0[G_1, G_2, ..., G_n]. If G_0 is a C-I graph and G_i, for 
i = 1, 2, ..., n, are complete graphs, then G is a C-I graph. 


Proof. Let G = G_0[G_1, G_2, ..., G_n] be the composition of graphs G_1, G_2, ..., G_n. 
By the definition of G, the vertices of G consist of the vertices in G_i, for i = 1, ..., n. 
Let G_0 be the C-I graph of a poset P_0 and let G_i, for i = 1, 2, ..., n, be complete 
graphs. Clearly G_i is the C-I graph of a poset P_i with every pair of elements a, b ∈ P_i 
being incomparable (a ∥ b). Now we construct a poset P from P_0 by replacing the 
element v_i in P_0 by the vertices of G_i. Corresponding to vertices a, b ∈ V(G_i), 
make a ∥ b in P for i = 1, 2, ..., n. If v_i ◁ v_j in P_0, then make u ◁ v in P for every 
u ∈ V(G_i) and v ∈ V(G_j). 

Consider the C-I graph G_P of the poset P. Now we need to prove that 
G_P = G. Now ab ∈ E(G_P) if and only if a ◁ b or b ◁ a or a ∥ b in P. Here 
a ◁ b (or b ◁ a) in P if and only if a ∈ V(G_i) and b ∈ V(G_j) with v_i ◁ v_j 
(or v_j ◁ v_i) in P_0, and a ∥ b in P if and only if either a and b are in the same V(G_i), 
or a ∈ V(G_i) and b ∈ V(G_j) with v_i ∥ v_j in P_0. That is, either ab ∈ E(G_i) or 
a ∈ V(G_i) and b ∈ V(G_j) such that v_i v_j ∈ E(G_0). Therefore, ab ∈ E(G). Hence 
the result. 
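
The construction in this proof can be checked on a small instance. The sketch below is illustrative only and rests on labeled assumptions: P_0 is given by covering pairs (a, b) meaning "b covers a", `blocks` (a hypothetical name) maps each element of P_0 to the vertex list of the complete graph substituted for it, and all helper names are invented for this example.

```python
from itertools import combinations

def closure(pairs):
    """Transitive closure of a set of (lower, upper) pairs."""
    less = set(pairs)
    while True:
        extra = {(a, d) for (a, b) in less for (c, d) in less if b == c} - less
        if not extra:
            return less
        less |= extra

def ci_edges(elements, covers):
    """C-I graph: covering pairs plus incomparable pairs."""
    less, cov = closure(covers), set(covers)
    return {frozenset((u, v)) for u, v in combinations(elements, 2)
            if (u, v) in cov or (v, u) in cov
            or ((u, v) not in less and (v, u) not in less)}

def blow_up(p0_covers, blocks):
    """Poset P: each element v of P_0 becomes the antichain blocks[v]."""
    elements = [a for v in blocks for a in blocks[v]]
    covers = [(a, b) for vi, vj in p0_covers
              for a in blocks[vi] for b in blocks[vj]]
    return elements, covers

def composition_with_cliques(g0_edges, blocks):
    """Edge set of G_0[G_1, ..., G_n] when every G_i is complete."""
    edges = {frozenset(p) for v in blocks for p in combinations(blocks[v], 2)}
    for e in g0_edges:
        vi, vj = tuple(e)
        edges |= {frozenset((a, b)) for a in blocks[vi] for b in blocks[vj]}
    return edges

def theorem9_holds(p0_elements, p0_covers, blocks):
    g0 = ci_edges(p0_elements, p0_covers)
    elements, covers = blow_up(p0_covers, blocks)
    return ci_edges(elements, covers) == composition_with_cliques(g0, blocks)
```

For instance, with P_0 on {1, 2, 3, 4} having covering pairs (1, 2), (2, 3), (1, 4) and blocks of sizes 2, 1, 2, 1, theorem9_holds compares the C-I graph of the blown-up poset with the composition graph.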


From the above results, we get the following, 


Corollary 1. If G = G_0[G_1, G_2, ..., G_n] is a comparability C-I graph, then G_0 
is a comparability C-I graph. 


Corollary 2. Let G = G_0[G_1, G_2, ..., G_n], where the G_i's are complete graphs for 
i = 1, 2, ..., n. Then G is a comparability C-I graph if and only if G_0 is a 
comparability C-I graph. 

Concluding Remarks: We have given some preliminary results on comparabil- 
ity graphs which are also C-I graphs, along with their corresponding posets, and 
obtained some families of such graphs. Using Theorem 9, we may obtain several 
new classes of these graphs by considering the known C-I graphs as the graph 
G_0. In particular, we can take the known C-I graphs that we have seen in this 
paper as the graph G_0 in Theorem 9, and obtain new classes of C-I graphs that 
are comparability graphs. It will be an interesting problem to characterize com- 
parability C-I graphs. We will address the composition of C-I graphs in detail in 
a forthcoming paper. 


Abstract. For a graph G = (V, E), a subset D of the vertex set V is 
a dominating set of G if every vertex not in D is adjacent to at least 
one vertex of D. A dominating set D of a graph G with no isolated 
vertices is called a paired dominating set (PD-set) if G[D], the sub- 
graph induced by D in G, has a perfect matching. The MIN-PD prob- 
lem requires computing a PD-set of minimum cardinality. The deci- 
sion version of the MIN-PD problem remains NP-complete even when 
G belongs to restricted graph classes such as bipartite graphs, chordal 
graphs, etc. On the positive side, the problem is efficiently solvable for 
many graph classes including interval graphs, strongly chordal graphs, 
permutation graphs, etc. In this paper, we study the complexity of the 
problem in AT-free graphs and planar graphs. The class of AT-free graphs 
contains cocomparability graphs, permutation graphs, trapezoid graphs, 
and interval graphs as subclasses. We propose a polynomial-time algo- 
rithm to compute a minimum PD-set in AT-free graphs. In addition, we 
also present a linear-time 2-approximation algorithm for the problem in 
AT-free graphs. Further, we prove that the decision version of the prob- 
lem is NP-complete for planar graphs, which answers an open question 
asked by Lin et al. (in Theor. Comput. Sci., 591 (2015): 99–105 and 
Algorithmica, 82 (2020): 2809–2840). 


Keywords: Domination - Paired domination - Planar graphs - 
AT-free graphs - Graph algorithms - NP-completeness - Approximation 
algorithm 


1 Introduction 


Let G = (V, E) be a graph. A vertex v ∈ V is adjacent to another vertex u ∈ V 
if uv is an edge of G. In this case, we say u is a neighbour of v. The set of all 


© Springer Nature Switzerland AG 2022 
N. Balachandran and R. Inkulu (Eds.): CALDAM 2022, LNCS 13179, pp. 65-77, 2022. 
https://doi.org/10.1007/978-3-030-95018-7_6 


66 V. Tripathi et al. 


vertices adjacent to v ∈ V, denoted by N_G(v), is known as the open neighbourhood 
of v, whereas the set N_G[v] = N_G(v) ∪ {v} is known as the closed neighbourhood of 
v in G. 

In a graph G = (V, E), a vertex v ∈ V dominates a vertex u ∈ V if u ∈ N_G[v]. 
A subset D of the vertex set V is a dominating set of G if every vertex of V is 
dominated by at least one vertex of D. The domination number, symbolized as 
γ(G), is the minimum cardinality of a dominating set. The concept of domination 
has wide applications and has been thoroughly studied in the literature. 
A survey of the results, both algorithmic and combinatorial, on domination 
can be found in [7,8]. Due to several applications in real-world problems, 
numerous variations of domination have been introduced by imposing one or more 
additional conditions on the dominating set. Many of these variations have also 
been thoroughly studied. Total domination is one of the important 
variations of domination. For a graph G = (V, E) without an isolated vertex, a 
total dominating set of G is a subset D of the vertex set such that every vertex of 
the graph is adjacent to at least one vertex in D. 

Paired domination is another important variation of domination, introduced 
by Haynes and Slater in [9]. A detailed survey of results on the domination problem 
and its variations can also be found in a recent book by Haynes et al. [6]. Given 
a graph G = (V, E) with no isolated vertices, a subset D of the vertex set V is a 
paired dominating set (PD-set) if D is a dominating set and the subgraph induced 
by D in G has a perfect matching. The paired domination number, symbolized 
as γ_pr(G), is the cardinality of a minimum PD-set of G. The MIN-PD problem 
requires computing a minimum-cardinality PD-set of a graph G without an isolated 
vertex. More precisely, the MIN-PD problem and its decision version are defined 
as follows: 


MIN-PD problem 


Instance: A graph G with no isolated vertices. 
Solution: A PD-set D. 
Measure: Size of D. 


DECIDE PD-SET problem 


Instance: A graph G and an integer k > 0 satisfying k ≤ |V|. 
Query: Is there a PD-set D of G satisfying |D| ≤ k? 
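
To make the two problem statements concrete, here is a brute-force sketch for very small graphs. It is purely illustrative (the decision problem is NP-complete in general, so this exhaustive search does not scale) and is not an algorithm from the paper; the graph is assumed to be a dictionary mapping each vertex to the set of its neighbours, and all function names are hypothetical.

```python
from itertools import combinations

def has_perfect_matching(vertices, adj):
    """Backtracking test for a perfect matching in the graph induced on `vertices`."""
    vertices = list(vertices)
    if not vertices:
        return True
    if len(vertices) % 2 == 1:
        return False
    first, rest = vertices[0], vertices[1:]
    for partner in rest:
        if partner in adj[first]:
            remaining = [w for w in rest if w != partner]
            if has_perfect_matching(remaining, adj):
                return True
    return False

def is_pd_set(adj, D):
    """D is dominating and G[D] has a perfect matching."""
    D = set(D)
    if not D:
        return False
    dominating = all(v in D or bool(adj[v] & D) for v in adj)
    return dominating and has_perfect_matching(D, adj)

def decide_pd_set(adj, k):
    """Is there a PD-set of size at most k? (PD-sets have even size.)"""
    vertices = list(adj)
    for size in range(2, k + 1, 2):
        if any(is_pd_set(adj, c) for c in combinations(vertices, size)):
            return True
    return False
```

On the path with five vertices, no pair of adjacent vertices dominates the whole path, so the smallest PD-set has four vertices.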


It is shown that the decision version of the problem is NP-complete for 
general graphs [9]. Therefore, the complexity of the problem has been studied for sev- 
eral restricted graph classes. It is proven that the decision version of the prob- 
lem is NP-complete when restricted to special graph classes, including bipartite 
graphs [3], perfect elimination bipartite graphs [16], and split graphs [3]. On 
the positive side, the problem is efficiently solvable in several important graph 
classes, including permutation graphs [12], interval graphs [3], block graphs [3], 
strongly chordal graphs [4], circular-arc graphs [13] and some others. A detailed 
survey of the results on paired domination can be found in [5]. In Fig. 1 we show 
the hierarchy of some important graph classes and the complexity status of the 
DECIDE PD-SET problem in these graph classes. 

Fig. 1. Complexity status of the MIN-PD problem in some well-known graph classes. 


The computational complexity of the problem is still unknown in some graph 
classes including planar graphs, AT-free graphs and circle graphs. AT-free graphs 
were introduced by Corneil et al. in [1]. The class of AT-free graphs includes some 
important classes of graphs such as interval graphs, permutation graphs and cocompara- 
bility graphs as subclasses. A minimum dominating set and a minimum total dominating 
set of an AT-free graph can be computed in polynomial time, see [11]. In this paper, 
we investigate the computational complexity of the problem on AT-free graphs 
and planar graphs. We show that a minimum PD-set of an AT-free graph can be 
computed in polynomial time. In addition, we give an approximation algorithm 
which computes a PD-set of any AT-free graph within a factor of 2. Lin et al. in 
[13] and [14] asked to determine the complexity of the problem in planar graphs. 
In this paper, we prove that the DECIDE PD-SET problem remains NP-complete 
even for planar graphs. The section-wise contribution of the paper is outlined as 
follows. 

In Sect. 2, we introduce some notation and definitions, including prop- 
erties of AT-free graphs. In Sect. 3, we prove the existence of a linear-time 2- 
approximation algorithm to compute a PD-set of an AT-free graph. In Sect. 4, 
we design a polynomial-time algorithm to compute a minimum cardinality PD- 
set of an AT-free graph. In Sect. 5, we show that the problem remains NP-hard 
for planar graphs. Finally, Sect. 6 winds up the paper with some interesting open 
questions on the problem. 


2 Preliminaries 


2.1 Basic Notations and Definitions 


In this paper, we consider only simple, connected and finite graphs with no iso- 
lated vertices. Let G = (V, E) be a graph. The sets V(G) and E(G) represent the 
node (vertex) set and the edge set, respectively, of the graph. When there is no ambi- 
guity regarding the graph G, for simplicity, we use V and E to denote V(G) 
and E(G) respectively. For an edge e = uv ∈ E, u and v are called the end vertices 
of e. For any non-empty set A ⊆ V, the open neighbourhood of A, symbolized as 
N_G(A), is given by N_G(A) = ⋃_{v∈A} N_G(v), whereas the set N_G[A] = N_G(A) ∪ A 
is known as the closed neighbourhood of A. Further, for a set A ⊆ V, G \ A represents 
the graph obtained by deleting the vertices of A, and all edges having at least 
one end vertex in A, from the graph. In case A = {u}, we use G \ u instead of 
G \ {u}. 

A subset X of the vertex set is an independent set if no two vertices of X are 
adjacent in G. A path P in G is a sequence of vertices (x_1, x_2, ..., x_n) such that 
(x_i, x_{i+1}) ∈ E for each i ∈ {1, 2, ..., n − 1}. For a path P = (x_1, x_2, ..., x_{n+1}) 
in G, the length of P is |V(P)| − 1 = n. Let x, y ∈ V(G). The distance between 
x and y in the graph G, denoted by d_G(x, y), is the length of a shortest path 
between x and y. The diameter of a graph G, denoted by diam(G), is defined as 
diam(G) = max{d_G(x, y) | x, y ∈ V(G)}. We use the standard notation [n] to 
denote the set {1, 2, ..., n}. 


2.2 AT-free Graphs 


Let G = (V, E) be a graph. A set T = {p, q, r} of three vertices is called an 
asteroidal triple (in short, AT) if T is an independent set and for any two vertices 
in the set T there exists a path P between them such that V(P) does not contain 
any vertex from the closed neighbourhood of the third. A graph is AT-free if it 
does not contain an asteroidal triple. A path on six vertices is an example of an 
AT-free graph. 
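
The definition of an asteroidal triple translates into a direct, if inefficient, test. The following sketch is illustrative only: it checks all triples by graph search, assumes the graph is given as a dictionary from each vertex to the set of its neighbours, and uses hypothetical function names.

```python
from itertools import combinations

def connected_avoiding(adj, s, t, forbidden):
    """Is there an s-t path using no vertex of `forbidden`?"""
    if s in forbidden or t in forbidden:
        return False
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen and w not in forbidden:
                seen.add(w)
                stack.append(w)
    return False

def is_asteroidal_triple(adj, p, q, r):
    """Independent triple in which each pair is joined by a path that
    avoids the closed neighbourhood of the third vertex."""
    independent = q not in adj[p] and r not in adj[p] and r not in adj[q]
    return (independent
            and connected_avoiding(adj, p, q, adj[r] | {r})
            and connected_avoiding(adj, p, r, adj[q] | {q})
            and connected_avoiding(adj, q, r, adj[p] | {p}))

def is_at_free(adj):
    return not any(is_asteroidal_triple(adj, *t) for t in combinations(adj, 3))
```

The path on six vertices passes this test, whereas the cycle on six vertices does not: three pairwise nonadjacent vertices of the cycle form an asteroidal triple.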


Definition 1. In a graph G = (V, E), a pair of vertices (x, y) is called a domi- 
nating pair if the vertex set of any path between x and y in G is a dominating 
set of G. A dominating shortest path is a shortest path connecting the vertices x 
and y of a dominating pair in G. 




Fig. 2. An AT-free graph G 


An asteroidal triple free graph is shown in Fig. 2. For the graph G in Fig. 2, 
(v_1, v_3) is a dominating pair, and P = (v_1, v_2, v_3) is a dominating shortest path. 
We have the following result for a connected AT-free graph in the literature. 

Theorem 1. [1,2] A dominating pair exists in every connected AT-free graph, and 
it can be computed in linear time. 


3 Approximation Algorithm 


In this section, we show that a PD-set of an AT-free graph G, whose cardinality 
is at most twice γ_pr(G), can be computed in linear time. Let G be an AT-free 
graph. Using Theorem 1, we note that there exists a dominating pair (x, y) in 
G. Assume that P is a dominating shortest path between x and y in G, and that 
the number of vertices in P is t. Note that any vertex that is not in P is 
adjacent to some vertex of V(P), as the set V(P) is a dominating set of G. We 
may also conclude that any vertex not in P has at most three neighbours in P, 
since otherwise P would not be a shortest path. By a similar argument we note 
that any two adjacent vertices in G dominate at most the vertices of a P_4 in P. 
Consequently, γ_pr(G)/2 ≥ ⌈t/4⌉, that is, γ_pr(G) ≥ 2·⌈t/4⌉. Before proving Theorem 2, 
which is the main result of this section, we notice that the following lemma is 
true. 


Lemma 1. For any odd positive integer n, ⌈n/4⌉ ≥ (n+1)/4. 


Proof. The proof is easy, and hence is omitted. 
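
For completeness, one possible short argument for the omitted proof is sketched here; it is our own derivation, not the authors', and it targets the inequality in the form needed later in the proof of Theorem 2, namely t + 1 ≤ 4⌈t/4⌉ for odd t.

```latex
Let $n$ be odd and write $n = 4q + r$ with $q \ge 0$ and $r \in \{1, 3\}$. Then
\[
  \left\lceil \frac{n}{4} \right\rceil = q + 1
  \qquad\text{and}\qquad
  \frac{n+1}{4} = q + \frac{r+1}{4} \le q + 1 ,
\]
so $\lceil n/4 \rceil \ge (n+1)/4$, with equality exactly when $r = 3$.
```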


Theorem 2. Given an AT-free graph G, a PD-set D of G can be computed in
linear time, satisfying $|D| \le 2\gamma_{pr}(G)$.


Proof. Given an AT-free graph G = (V, E), there is a linear-time algorithm
to find a dominating pair (x, y) of G (by Theorem 1). Let
$P = (x = v_1, v_2, \ldots, v_{t-1}, v_t = y)$ be a shortest path between x and y, and let D = V(P).
We have already observed that $\gamma_{pr}(G) \ge 2\lceil t/4 \rceil$. We prove the result by considering the
following two cases:


Case 1: t is even.
Here, the set D is a PD-set and $|D| = t \le 4\lceil t/4 \rceil \le 2\gamma_{pr}(G)$.


Case 2: t is odd.
In this case, we construct a PD-set of the graph G by adding at most one
vertex to D. Clearly, D is a dominating set. For the pairing, we pair $v_i$ with $v_{i+1}$ for
each odd $i \in [t-2]$. It remains to pair $v_t$. If $N(v_t) \subseteq D$, then $D \setminus \{v_t\}$ is a
PD-set of G; otherwise, there exists a vertex $u \in N(v_t) \setminus D$, and the updated
set $D = D \cup \{u\}$ is a PD-set of G. Therefore, we can always construct a PD-set
D of G with $|D| \le t+1$. Using Lemma 1, we have $t+1 \le 4\lceil t/4 \rceil$. Hence,
$|D| \le t+1 \le 2\gamma_{pr}(G)$.

In both cases, we obtain a PD-set D satisfying $|D| \le 2\gamma_{pr}(G)$.
Hence, we have an efficient 2-approximation algorithm to compute a PD-set of
an AT-free graph.
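
The construction in the proof of Theorem 2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the graph is a connected adjacency dictionary with at least two vertices, the dominating pair (x, y) is supplied by the caller (the linear-time computation of Theorem 1 is not reproduced), and the function names `shortest_path`, `pd_set_from_dominating_pair` and `is_pd_set` are hypothetical.

```python
from collections import deque

def shortest_path(adj, x, y):
    """Return one shortest x-y path as a list of vertices, found by BFS."""
    parent = {x: None}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if u == y:
            break
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    path = []
    v = y
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

def pd_set_from_dominating_pair(adj, x, y):
    """Return the pairs of a PD-set built from the path, following Cases 1 and 2."""
    path = shortest_path(adj, x, y)
    t = len(path)
    # Pair consecutive path vertices: (v1, v2), (v3, v4), ...
    pairs = [(path[i], path[i + 1]) for i in range(0, t - 1, 2)]
    if t % 2 == 1:
        last = path[-1]
        on_path = set(path)
        outside = [u for u in adj[last] if u not in on_path]
        if outside:
            # Case 2, second alternative: pair v_t with a neighbour outside D.
            pairs.append((last, outside[0]))
        # Otherwise N(v_t) is contained in D and v_t is simply dropped.
    return pairs

def is_pd_set(adj, pairs):
    """Check that the pairs are disjoint edges whose endpoints dominate the graph."""
    used = [v for pair in pairs for v in pair]
    if len(used) != len(set(used)):
        return False
    if any(b not in adj[a] for a, b in pairs):
        return False
    dominated = set(used)
    for v in used:
        dominated.update(adj[v])
    return dominated == set(adj)
```

For example, on the path 0-1-2-3-4 with dominating pair (0, 4), the sketch pairs (0, 1) and (2, 3) and drops the last path vertex.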


4 Exact Polynomial-Time Algorithm


The main purpose of this section is to establish a polynomial-time algorithm that
outputs a minimum cardinality PD-set when the input graph is an AT-free
graph. For this, we first present a theorem which will be useful in designing
our algorithm. In this theorem, we show that there exists a BFS-tree T of G
and a minimum PD-set D of G such that the number of vertices of D in any
set of consecutive levels of T is bounded. We use the notation $L_i$ to denote the
vertices at the $i$-th level of the tree T, that is, the set of vertices which are
at distance i from the root node in T. The following result is already known
in the literature.
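
The level sets $L_i$ used throughout this section can be computed with a standard breadth-first search. The following is a minimal sketch, assuming a connected graph given as an adjacency dictionary; the function name `bfs_levels` is hypothetical.

```python
from collections import deque

def bfs_levels(adj, root):
    """Return [L_0, L_1, ...], where L_i lists the vertices at distance i from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    levels = [[] for _ in range(max(dist.values()) + 1)]
    for v, d in dist.items():
        levels[d].append(v)
    return levels
```

The number of levels returned minus one corresponds to the index l of the last BFS-level.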


Theorem 3. [10] Let G be an AT-free graph with a dominating pair (x, y), and let
T be a BFS-tree of G rooted at x. Let $L_0, L_1, L_2, \ldots, L_l$ be the BFS-levels of
T. Then there exists a linear-time algorithm which computes a path
$P = (x = x_0, x_1, x_2, \ldots, x_d = y)$ such that $x_i \in L_i$ for each $0 \le i \le d$ and every
vertex $w \in L_i$ for $i \in \{1, 2, \ldots, l\}$ is adjacent to either $x_{i-1}$ or $x_i$.


Theorem 4. Let G = (V, E) be an AT-free graph and (x, y) be a dominating pair
of G. If $L_0, L_1, L_2, \ldots, L_l$ are the BFS-levels of the BFS-tree T rooted at x, then
there exists a minimum cardinality PD-set $D_p$ of G such that
$|D_p \cap \bigcup_{k=i}^{i+j} L_k| \le j+4$ for all $i \in \{0, 1, \ldots, l\}$ and $j \in \{0, 1, \ldots, l-i\}$.


Proof. Let G = (V, E) be an AT-free graph and $D_p$ be a minimum cardinality
PD-set of the graph G. Suppose that the set $D_p$ does not satisfy the given
property, that is, there is at least one pair (i, j) such that $|D_p \cap \bigcup_{k=i}^{i+j} L_k| > j+4$,
where $i \in \{0, 1, \ldots, l\}$ and $j \in \{0, 1, \ldots, l-i\}$. Let
$B = \{(i, j) : |D_p \cap \bigcup_{k=i}^{i+j} L_k| \ge j+5\}$. Note that $B \ne \emptyset$. Now we choose the pair $(i', j')$ such that
$i' = \min\{i \mid (i, j) \in B\}$ and $j' = \max\{j \mid (i', j) \in B\}$. By the choice of the pair $(i', j')$, note that
$D_p \cap L_{i'-1} = \emptyset$ and $D_p \cap L_{i'+j'+1} = \emptyset$. Using the properties of a BFS-tree, we
note that for any vertex $v \in (D_p \cap \bigcup_{k=i'}^{i'+j'} L_k)$, any neighbour of v belongs to one of
the levels $L_{i'-1}, L_{i'}, \ldots, L_{i'+j'+1}$. Let $A = \{x_{i'-2}, x_{i'-1}, \ldots, x_{i'+j'+1}\}$. Note that
$|A| = j'+4$. Since V(P) is a dominating set of G and each vertex $z \in L_i$ is
adjacent to either $x_{i-1}$ or $x_i$, we have $\bigcup_{k=i'-1}^{i'+j'+1} L_k \subseteq N[A]$. Now, by updating $D_p$, we will
find another minimum PD-set $D'_p$ such that $A \subseteq D'_p$.


Case 1: $x_{i'-2} \notin D_p$ and |A| is even.

Since $D_p \cap L_{i'+j'+1} = \emptyset$, we have $x_{i'+j'+1} \notin D_p$. If $x_{i'-2} \notin D_p$ and |A| is even, then the
set $D'_p = (D_p \setminus \bigcup_{k=i'}^{i'+j'} L_k) \cup A$ is a PD-set of G with $|D'_p| < |D_p|$, a contradiction
to the choice of $D_p$.


Case 2: $x_{i'-2} \notin D_p$ and |A| is odd.

Since |A| is odd and G[A] is a path, if we include A in a PD-set
we can pair all the vertices in A except one. We pair $(x_{i'-2}, x_{i'-1})$,
$(x_{i'}, x_{i'+1}), \ldots, (x_{i'+j'-1}, x_{i'+j'})$. It remains to pair $x_{i'+j'+1}$. Suppose first that
$N_G(x_{i'+j'+1}) \setminus \{x_{i'+j'}\} \subseteq D_p$ and
$(N_G(x_{i'+j'+1}) \setminus \{x_{i'+j'}\}) \cap \bigcup_{k=i'}^{i'+j'} L_k = \emptyset$. In this
case, using the property of the path P, all the vertices in $L_{i'+j'+1}$ are adjacent
to $x_{i'+j'}$. Hence the set $D'_p = (D_p \setminus \bigcup_{k=i'}^{i'+j'} L_k) \cup (A \setminus \{x_{i'+j'+1}\})$ is a PD-set of G
with $|D'_p| < |D_p|$, a contradiction. Otherwise, there is a vertex $u \in N_G(x_{i'+j'+1}) \setminus \{x_{i'+j'}\}$
such that $u \notin D_p$, or $N_G(x_{i'+j'+1}) \setminus \{x_{i'+j'}\} \subseteq D_p$ but there is a vertex
$u \in N_G(x_{i'+j'+1}) \setminus \{x_{i'+j'}\}$ such that $u \in \bigcup_{k=i'}^{i'+j'} L_k$; in either situation take
$A' = A \cup \{u\}$. Note that the set $D'_p = (D_p \setminus \bigcup_{k=i'}^{i'+j'} L_k) \cup A'$ is a PD-set of G,
implying that $|D'_p| \ge |D_p|$. Also, we have $|D_p \cap \bigcup_{k=i'}^{i'+j'} L_k| \ge j'+5$ and $|A'| = j'+5$,
implying that $|D'_p| \le |D_p|$. Hence $D'_p$ is also a minimum PD-set of G.


Case 3: $x_{i'-2} \in D_p$ and |A| is even.

Proof is omitted due to space constraints.


Case 4: $x_{i'-2} \in D_p$ and |A| is odd.

Proof is omitted due to space constraints.

Further, note that if $i'$ is 0 or 1, we can choose $A = \{x_0, x_1, \ldots, x_{i'+j'+1}\}$
if $|\{x_0, x_1, \ldots, x_{i'+j'+1}\}|$ is even; otherwise we can choose
$A = \{x_0, x_1, \ldots, x_{i'+j'+1}, u\}$, where $u \in N(x_{i'+j'+1})$. We can show the existence of u as we did
above. In both cases $|A| \le j'+4$, implying that $D'_p = (D_p \setminus \bigcup_{k=i'}^{i'+j'} L_k) \cup A$
is a PD-set of G having cardinality less than the minimum cardinality PD-set
$D_p$ of G, a contradiction. Hence $i' \notin \{0, 1\}$. Similarly, we can claim that
$i'+j' \notin \{l-1, l\}$.

We call this replacement of $D_p$ with $D'_p$ an exchange step. Now, if
$|D'_p \cap \bigcup_{k=i}^{i+j} L_k| \le j+4$ for all $i \in \{0, 1, \ldots, l\}$ and $j \in \{0, 1, \ldots, l-i\}$, then G has a
minimum paired dominating set $D'_p$ satisfying the condition given in Theorem 4.
Otherwise, let $B' = \{(i, j) : |D'_p \cap \bigcup_{k=i}^{i+j} L_k| \ge j+5\}$. Suppose $(i, j) \in B'$. Now
we will show that $i > i'$. For contradiction, suppose $i \le i'$. In this case note
that $i+j \ge i'-2$, otherwise $(i, j) \in B$, contradicting the choice of $i'$. Also,
$|D'_p \cap L_t| = 1$ for all $t \in \{i'-2, i'-1, \ldots, i'+j'+1\}$. Hence for $(i, j) \in B'$ with
$i \le i'$ and $i+j \ge i'-2$ there exists a $j''$ such that $(i, j'') \in B'$ and $i+j'' \ge i'+j'+1$.
By construction of $D'_p$ we note that $|D_p \cap \bigcup_{k=i}^{i+j''} L_k| \ge |D'_p \cap \bigcup_{k=i}^{i+j''} L_k| \ge j''+5$,
implying that $(i, j'') \in B$, a contradiction to the choice of $i'$ or $j'$. Hence $i > i'$.
Therefore, if $i'' = \min\{i \mid (i, j) \in B'\}$ then $i'' > i'$.

This implies that, at every exchange step, we replace a minimum cardinality
PD-set $D_p$ with an updated minimum cardinality PD-set $D'_p$. After each
exchange step, the smallest value of i for which there is a
$j \in \{0, 1, \ldots, l-i\}$ satisfying $|D_p \cap \bigcup_{k=i}^{i+j} L_k| \ge j+5$, for the current minimum
cardinality PD-set $D_p$, increases. Therefore, we conclude that, if we start with a
minimum cardinality PD-set $D_p$, we obtain a minimum cardinality PD-set
$D'_p$ such that $|D'_p \cap \bigcup_{k=i}^{i+j} L_k| \le j+4$ for all $i \in \{0, 1, \ldots, l\}$ and $j \in \{0, 1, \ldots, l-i\}$,
by executing at most d exchange steps.


Now we are ready to present an algorithm to compute a minimum cardinality
PD-set of an AT-free graph G. Using Theorem 4, we may conclude that there is a
minimum PD-set of G that contains at most 6 vertices from any three consecutive
BFS-levels of x, where (x, y) is a dominating pair of G. The idea behind our
algorithm is the following:

In our algorithm, we explore a BFS-level of x in each iteration. In the $i$-th
iteration of the algorithm, we do the following:

- store all the possible sets $X' \subseteq \bigcup_{j=0}^{i+1} L_j$ such that X′ dominates all the
vertices up to the $i$-th level;
- ensure that all the vertices in $X' \cap (\bigcup_{j=0}^{i} L_j)$ are paired, as these vertices cannot
be paired with a vertex at level i + 2 or above;
- for every possible set X′, store another set $X = X' \cap (L_i \cup L_{i+1})$.


The set X helps in extending a partial solution X′ to the next level, as we
are restricted to selecting at most 6 vertices from any three consecutive levels in a
minimum PD-set. Below, we provide the detailed algorithm for computing
a minimum cardinality PD-set $D_p$ of an AT-free graph G. The set $D_p$ maintains
the property that it contains at most 6 vertices from any three consecutive
BFS-levels of x.


Algorithm 1: Minimum Paired Domination in AT-free Graphs 


Input: A connected AT-free graph G = (V, E) with a dominating pair (x, y);
Output: A PD-set $D_p$ of G;

Compute the BFS-levels of x;
For $0 \le i \le l$, let $L_i = \{w \in V \mid d_G(x, w) = i\}$ denote the set of vertices at level
i in the BFS of G rooted at x.
In particular, $L_0 = \{x\}$.
Initialize the queue $Q_1$, which contains an ordered tuple (X, X, size(X)) for all
non-empty $X \subseteq N[x]$ such that $size(X) = |X| \le 6$;
Initialize i = 1;
while ($Q_i \ne \emptyset$ and i < l) do
    Update i = i + 1;
    for (each element (X, X′, size(X′)) of the queue $Q_{i-1}$) do
        for (every $U \subseteq L_i$ with $|X \cup U| \le 6$) do
            if ($L_{i-1} \subseteq N[X \cup U]$ and there exists a set $U' \subseteq U$ such that
            $G[X' \cup U']$ has a perfect matching) then
                $Y = (X \cup U) \setminus L_{i-2}$;
                $Y' = X' \cup U$;
                size(Y′) = size(X′) + |U|;
                if (for all elements (X, X′, size(X′)) of $Q_i$, $X \ne Y$) then
                    insert (Y, Y′, size(Y′)) in the queue $Q_i$;
                if (there is a tuple (Z, Z′, size(Z′)) in $Q_i$ such that Z = Y and
                size(Y′) < size(Z′)) then
                    delete (Z, Z′, size(Z′)) from $Q_i$;
                    insert (Y, Y′, size(Y′)) in $Q_i$;

Among all the triples (X, X′, size(X′)) in the queue $Q_l$ that satisfy $L_l \subseteq N[X]$
and G[X′] has a perfect matching, find one such that size(X′) is minimum, say
(D, D′, size(D′));
$D_p = D'$;
return $D_p$;
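
When experimenting with an implementation of Algorithm 1, it can help to have an independent reference to compare against on very small graphs. The sketch below is not Algorithm 1: it is a plain exhaustive search (exponential time) that tries all even-sized vertex subsets in increasing order of size. The function names are hypothetical, and the graph is assumed to be an adjacency dictionary without isolated vertices.

```python
from itertools import combinations

def has_perfect_matching(adj, vertices):
    """Brute-force test whether the subgraph induced by `vertices` has a perfect matching."""
    vertices = list(vertices)
    if not vertices:
        return True
    v, rest = vertices[0], vertices[1:]
    # Try to match the first vertex with each of its neighbours among the rest.
    return any(
        has_perfect_matching(adj, [w for w in rest if w != u])
        for u in rest
        if u in adj[v]
    )

def is_paired_dominating(adj, candidate):
    """Check domination and the existence of a perfect matching in G[candidate]."""
    dominated = set(candidate)
    for v in candidate:
        dominated.update(adj[v])
    return dominated == set(adj) and has_perfect_matching(adj, candidate)

def brute_force_min_pd_set(adj):
    """Return a minimum PD-set by exhaustive search, or None if none exists."""
    vertices = list(adj)
    for size in range(2, len(vertices) + 1, 2):
        for candidate in combinations(vertices, size):
            if is_paired_dominating(adj, candidate):
                return set(candidate)
    return None
```

On the path 0-1-2-3, for instance, the search returns the pair {1, 2}; such outputs give a size to check any faster implementation against.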

Now we prove the following theorem to show that Algorithm 1 returns a
minimum PD-set. We also analyse the running time of the algorithm.


Theorem 5. Let G = (V, E) be an AT-free graph such that |V| = n and |E| =
m. Algorithm 1 computes a minimum cardinality PD-set of G in O(n®-°)-time. 


Proof. Proof is omitted due to space constraints. 


5 Paired Domination in Planar Graphs 


In this section, we show that the DECIDE PD-SET problem is NP-complete even
when restricted to planar graphs. For this purpose, we give a polynomial-time
reduction from the MINIMUM VERTEX COVER (MIN-VC) problem to the MIN-PD
problem. In a graph G = (V, E), a vertex cover is a set $C \subseteq V$ such that C
contains at least one endpoint of every edge $e \in E$. The MIN-VC problem requires
computing a minimum cardinality vertex cover of a given graph G. The following
theorem is already known for the MIN-VC problem.
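
The vertex cover definition above can be made concrete with a small sketch. The function names are hypothetical, and the minimum is found by exhaustive search, so this is only suitable for tiny illustrative graphs, not as an algorithm for MIN-VC.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """A set is a vertex cover if every edge has at least one endpoint in it."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)

def min_vertex_cover_size(vertices, edges):
    """Smallest size of a vertex cover, found by trying subsets in increasing size."""
    for size in range(len(vertices) + 1):
        if any(is_vertex_cover(edges, c) for c in combinations(vertices, size)):
            return size
```

For the complete graph on four vertices, which is planar and cubic, the minimum vertex cover has size 3.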


Theorem 6. [15] The MIN-VC problem is NP-hard for planar cubic
graphs.


Now, we prove the main result of this section. 


Theorem 7. The DECIDE PD-SET problem is NP-complete for planar graphs 
with maximum degree 5. 


Proof. Clearly, the DECIDE PD-SET problem is in NP. To show the hardness
of the problem, we give a reduction from the MIN-VC problem, which is NP-hard
for planar cubic graphs by Theorem 6. Let G = (V, E) be a planar cubic graph
with $V = \{v_1, v_2, \ldots, v_n\}$. We transform the graph G into a graph G′ = (V′, E′)
as follows:


- Replace each vertex $v_i \in V$ with the gadget $G_{v_i}$ shown in Fig. 3.
- If three edges $e_j, e_k, e_l$ were incident on $v_i$ in G, then in G′ we make $e_j$
incident on $v_i^1$, $e_k$ incident on $v_i^2$, and $e_l$ incident on $v_i^3$.

We note that the graph G′ is a planar graph with maximum degree 5, and
G′ can be computed from G in polynomial time. Now, to prove the result, we
only need to prove the following claim:


Claim. If $\beta(G)$ denotes the cardinality of a minimum vertex cover of G, then
$\gamma_{pr}(G') = 4n + 2\beta(G)$, where n denotes the number of vertices in G.


Proof. Let V° be a minimum cardinality vertex cover of G. Let D, = 
{U7,4p 07, Yi, UP, 2 | vi © VOU {y?, 27,47, ef | vi € V°} where 7 € [n]. Note 


that if $v_i \notin V^c$, then all three vertices adjacent to $v_i$ in G must be present
in $V^c$. Using this fact, it can be easily verified that $D_p$ is a PD-set of G′, and
$|D_p| = 6\beta(G) + 4(n - \beta(G)) = 4n + 2\beta(G)$. Therefore, if $D_p^*$ is a minimum
cardinality PD-set of G′, then $|D_p^*| \le 4n + 2\beta(G)$. Hence, we have


$\gamma_{pr}(G') \le 4n + 2\beta(G)$ (1)


Fig. 3. Gadget $G_{v_i}$ used in the construction of graph G′ from G in Theorem 7.


Conversely, suppose Dp is a minimum cardinality PD-set of G’. Then, 
to dominate the vertex x}, Dp 1 {x}, y;,y?} must be non-empty. Further, a 
vertex u € {zx}, y},y7} 9 Dy can only be paired with a vertex in the set 
{v7 25, Yi Yi, 2} \ fu}. Hence, |Dp M {v;,2;,y;,¥7,2;}| 2 2. Similarly, we 
have |D, ON {v3, 23, y2, yf, 22}| > 2. Therclon, for each i € [n], we have 
ID, AV (Gs,)| S . Note that ° dominate xz}, D,  {ai, yj, y7} 74 0. ~ 
ther, to dominate a, Deli {ee a arts U. Sienilarly. to dominate x? and a? 
D anf? ye ut x 0 anid "D Ue 27,0e+ 20 cap en Therefore, we obene 
that, if |Dp NV(G»,)| = 4, then Dp NV(Go,) = {y?, 21, y?, 23 }- 

Now, we prove that we can update $D_p$ such that $D_p$ remains a minimum
cardinality PD-set of G′ and for each $i \in [n]$, $|D_p \cap V(G_{v_i})| = 4$ or
$|D_p \cap V(G_{v_i})| = 6$. Suppose $|D_p \cap V(G_{v_i})| = 5$ for some $i \in [n]$. As we observed,
the vertices dominating $x_i^1$ and $x_i^2$ are paired with vertices of $V(G_{v_i})$, and
$|D_p \cap V(G_{v_i})| \ge 4$. Hence, if $|D_p \cap V(G_{v_i})| = 5$, then $D_p \cap \{v_i^1, v_i^2, v_i^3\} \ne \emptyset$, as only
these vertices of the gadget $G_{v_i}$ can be paired with a vertex of another gadget.


Case 1: Suppose vj € Dp. 

In this case, first we show that D, N {v?,v3} = @. Note that uj is paired 
with a vertex of some other gadget, and v} is not dominating x}. Further, if u 
is the vertex dominating vertex x} then u can only be paired with a vertex in 
the set {x}, y?,y7, 27} \ {u}. Therefore, |Dp N {vj ci, yr, ¥2, 2 }| > 3. Also as 
[Dp N{u?, 22, y?, yf, 22 }| > 2 and |D, nVv(G, ;)| = 5, we have v3 ¢ D,. Further, 

as |Dp al {07,23 YE YF z7}| 23 SHOE StORS ID, al {v2,02,y3,y!, 23}| = = 2. Now, 
if v? € D, then v? is pate with y? this leaves the vertex z? undominated, a 
eouteadicnion, EBON v? ¢ Dy. This concludes that D, n {v2, ve} =D. 

Now, let v} is paired with a vertex u of another gadget: say Gy, where 
i # 7. Note that we 10590505}. It is ed to observe that |D,M V(Gy | 
5. Suppose v} is paired with v; . Now if OF ¢ D, then wee Dy as follows: 


Dy = De V(Ge) Utley ere Put and pair U; with Yj: Now, suppose that 
y; already belongs to Dy. Note that y; is paired with either y7 or a}. If both 
y; and x; € D, then y; must be paired with 2}. In this case, the set Dj, = 
Dy \ (V(Go,) U {aj}) U {y?, y?, 2], 22, yj} where v; is paired with y; is a PD-set 
of G’ and |Dj,| < |Dp|, a contradiction. Therefore, in this case either y7 ¢ D, or 
a; ¢ Dy. If x; ¢ Dy then y; is pare va y; and in us case, we update D, es 
follows: Dp = Dy \ V(Gu,) U {y?, 92, 22, 23, oe}, pair vj; with yj and y? td. x} 
We can update D, in a similar way if uy; ¢ Dp. 

Similarly we can update D, if v} is paired with v5 . Now suppose v} is 
pared with hes If 2° ¢ D, then apa D, as follows: De = = Dy \ V(Gez) VU 
1g? 478 2 388, 23} sad pair v3 with we But, if 2 € Dy, we may aise that 
it is possible to update D, by giving similar areutianis as above with suitable 
modifications, such that vs is paired with z7. After update in each case, we may 
note that |D,1V(G»,)| = 4 and |D, NV(G»,)| = 6. 


Case 2: Suppose v? € Dy. 
The arguments are similar to Case 1. 


Case 3: puppose wo € D,. 

Let v? is paired with a vertex u of another gadget G,,,. Since, |D,NV (Gy, )| = 
5 have IDp 0 {ut 2t yt, y2, 22 }| = 2 and [Dp iN {8,03 8,uf 23}| = 2. Now, 
if vj € D, then v; is paired with yj this leaves the vertex zj Faces a 
contradiction. Similarly, if v? € D, then v? is paired vale y? this leaves the 
vertex 2? undominated, a Sontradicticns Hence, D, M {v;, v7} = 0. Now we can 
give sano arguments as Case 1, to show that D, can be updated such that 
IDp NV(G.,)| = (Gu) 6. 

Now, without loss of generality, we may assume that there exists a minimum 
cardinality PD-set of G’ such that for each 7 € [n], |Dp, N V(G»,)| = [Dp A 
V(Go,)| 2 

Define $V^c = \{v_i \in V \mid |D_p \cap V(G_{v_i})| \ge 6\}$. Next, we claim that $V^c$ is a
vertex cover of G. Consider any two distinct vertices $v_i$ and $v_j$ in G such that
$v_i v_j \in E(G)$. We prove that either $|D_p \cap V(G_{v_i})| = 6$ or $|D_p \cap V(G_{v_j})| = 6$. Let
$v_i^k$ be adjacent to $v_j^{k'}$, where $k, k' \in [3]$. Note that if $|D_p \cap V(G_{v_i})| = 4$
and $|D_p \cap V(G_{v_j})| = 4$, then from the above observation we have
$D_p \cap V(G_{v_i}) = \{y_i^2, z_i^1, y_i^3, z_i^2\}$ and $D_p \cap V(G_{v_j}) = \{y_j^2, z_j^1, y_j^3, z_j^2\}$, which leaves the vertices $v_i^k$ and
$v_j^{k'}$ undominated, a contradiction. Therefore, $V^c$ is a vertex cover of G. Also,
$\gamma_{pr}(G') \ge 6|V^c| + 4(n - |V^c|)$. So, we have $2|V^c| \le \gamma_{pr}(G') - 4n$. Hence,


$2\beta(G) \le \gamma_{pr}(G') - 4n$ (2)


Therefore, using Eqs. 1 and 2, we have $\gamma_{pr}(G') = 4n + 2\beta(G)$. This proves
the claim.
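
As a quick sanity check of the counting in the claim, consider the complete graph $K_4$, chosen here purely for illustration: it is planar and cubic, with n = 4 and $\beta(K_4) = 3$. Six gadget vertices are counted for each cover vertex and four for each remaining vertex.

```latex
\begin{align*}
6\,\beta(G) + 4\,(n - \beta(G)) &= 6 \cdot 3 + 4 \cdot (4 - 3) = 22, \\
4n + 2\,\beta(G) &= 4 \cdot 4 + 2 \cdot 3 = 22 .
\end{align*}
```

Both expressions agree, so for this input the claim would give $\gamma_{pr}(G') = 22$.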


Since the MIN-VC problem is NP-hard for cubic planar graphs, from the above
claim we conclude that the DECIDE PD-SET problem is NP-complete for planar
graphs with maximum degree 5.


6 Concluding Remarks


In this paper, we resolve the complexity of the MIN-PD problem for planar
graphs and AT-free graphs. We proposed a polynomial-time algorithm for the MIN-PD
problem in AT-free graphs. We also proposed a 2-approximation algorithm to
compute a PD-set in AT-free graphs. Since the class of AT-free graphs includes the
class of cocomparability graphs, the results and algorithms presented for paired
domination in AT-free graphs also hold for cocomparability graphs. We further
investigated the computational complexity of the problem in planar graphs and
proved that the problem is NP-hard. The complexity of the problem is still not
known for circle graphs, and it would be interesting to investigate the complexity
status of the MIN-PD problem in circle graphs. Further, it would be interesting to design
a more efficient algorithm for the problem in AT-free graphs and cocomparability
graphs.


Abstract. A k-star colouring of a graph G is a function $f : V(G) \to \{0, 1, \ldots, k-1\}$
such that $f(u) \ne f(v)$ for every edge uv of G, and G does
not contain a 4-vertex path bicoloured by f as a subgraph. For $k \in \mathbb{N}$,
the problem k-STAR COLOURABILITY takes a graph G as input and asks
whether G is k-star colourable. By the construction of Coleman and Moré
(SIAM J. Numer. Anal., 1983), for all $k \ge 3$, k-STAR COLOURABILITY is
NP-complete for graphs of maximum degree $d = k(k - 1 + \lceil \sqrt{k} \rceil)$. For
k = 4 and k = 5, the maximum degree in this NP-completeness result is
d = 20 and d = 35, respectively. We reduce the maximum degree to d = 4
in both cases: i.e., 4-STAR COLOURABILITY and 5-STAR COLOURABILITY
are NP-complete for graphs of maximum degree four. We also show that
for all $k \ge 3$ and $d < k$, the time complexity of k-STAR COLOURABILITY is
the same for graphs of maximum degree d and d-regular graphs (i.e., the
problem is either in P for both classes or NP-complete for both classes).


Keywords: Graph coloring - Vertex coloring - Star coloring - 
Complexity 


1 Introduction 


Star colouring is a well-known variant of (vertex) colouring introduced
by Grünbaum [7] in the 1970s. The scientific computing community independently
discovered star colouring in the 1980s and used it for lossless compression
of symmetric sparse matrices, which is in turn used in the estimation of
sparse Hessian matrices (see the survey [6]). A k-colouring f of a graph G, say
$f : V(G) \to \{0, 1, \ldots, k-1\}$, is a k-star colouring of G if G does not contain a
4-vertex path bicoloured by f as a subgraph. For every positive integer k, the
problem k-STAR COLOURABILITY takes a graph G as input and asks whether
G is k-star colourable. The problem k-COLOURABILITY is defined likewise. The
problem STAR COLOURABILITY takes a graph G and a positive integer k as
input and asks whether G is k-star colourable.

The complexity of star colouring has been studied in various graph classes. STAR
COLOURABILITY is polynomial-time solvable for cographs [10] and line graphs
of trees [12]. For the class of co-bipartite graphs, STAR COLOURABILITY is
NP-complete even though k-STAR COLOURABILITY is polynomial-time solvable for
every $k \in \mathbb{N}$ [2,13]. Coleman and Moré [3] proved that for all $k \ge 3$,
k-STAR COLOURABILITY is NP-complete for bipartite graphs. The problem 3-STAR
COLOURABILITY is NP-complete for planar bipartite graphs [1], line graphs of
subcubic graphs [9] and graphs of arbitrarily large girth [2]. Gebremedhin et al. [5]
produced an inapproximability result on star colouring of bipartite graphs. In this
paper, we focus on the classes of bounded degree graphs and regular graphs.

For motivation, let us look at the complexity of colouring in bounded degree
graphs. It is well-known that for all $k \ge 3$, k-COLOURABILITY is NP-complete.
Emden-Weinert et al. [4] proved that k-COLOURABILITY remains NP-complete
when restricted to graphs of maximum degree $d = k - 1 + \lceil \sqrt{k} \rceil$. For sufficiently
large k, the maximum degree in this NP-completeness result is the minimum
possible (because the problem is in P for $d < k - 1 + \lceil \sqrt{k} \rceil$ [11, Theorem 43]).
Coleman and Moré [3] proved that for $k \ge 3$, k-STAR COLOURABILITY is
NP-complete for bipartite graphs. From their construction, it follows that for all
$k \ge 3$, k-STAR COLOURABILITY is NP-complete for graphs of maximum degree
$d = k(k - 1 + \lceil \sqrt{k} \rceil)$. We seek to reduce the maximum degree in this NP-completeness
result to the minimum possible. For k = 3, Lei et al. [9] reduced the maximum
degree to d = 4 (because 3-STAR COLOURABILITY is NP-complete for line graphs
of subcubic graphs). No similar result exists for k > 3. For k = 4 and k = 5,
the maximum degree of the output graph in Coleman and Moré's construction
is d = 20 and d = 35, respectively. We reduce the maximum degree to d = 4
in both cases: i.e., 4-STAR COLOURABILITY and 5-STAR COLOURABILITY are
NP-complete for graphs of maximum degree four.
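
The degree bounds quoted above are easy to reproduce. The sketch below simply evaluates the two formulas $k - 1 + \lceil \sqrt{k} \rceil$ and $k(k - 1 + \lceil \sqrt{k} \rceil)$ with exact integer arithmetic; the function names are hypothetical and chosen only for this illustration.

```python
import math

def ceil_sqrt(k):
    """Smallest integer r with r * r >= k, computed without floating point."""
    r = math.isqrt(k)
    return r if r * r == k else r + 1

def colouring_degree_bound(k):
    """Maximum degree k - 1 + ceil(sqrt(k)) in the k-COLOURABILITY hardness result."""
    return k - 1 + ceil_sqrt(k)

def star_colouring_degree_bound(k):
    """Maximum degree k * (k - 1 + ceil(sqrt(k))) derived from the Coleman-Moré construction."""
    return k * colouring_degree_bound(k)
```

Evaluating `star_colouring_degree_bound` at k = 4 and k = 5 gives 20 and 35, matching the values stated in the text.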

It is easy to show that for all k and d, the time complexity of
k-COLOURABILITY is the same for graphs of maximum degree d and d-regular
graphs (i.e., the problem is either in P for both classes or NP-complete for both
classes). Such a property does not hold in general for star colouring: 3-STAR
COLOURABILITY is NP-complete for graphs of maximum degree four [9] whereas
it is in P for 4-regular graphs [15]. We show that for $k \ge 3$ and $d < k$, the
time complexity of k-STAR COLOURABILITY is the same for graphs of maximum
degree d and d-regular graphs. We briefly discuss the consequences of our results
for hardness transitions of the problem k-STAR COLOURABILITY restricted to the
class of graphs of maximum degree d (resp. d-regular graphs).

The paper is organized as follows. See Sect. 2 for definitions. The hardness 
results on bounded degree graphs and regular graphs appear in Sects. 3 and 4, 
respectively. 


2 Definitions 


All graphs considered in this paper are finite, simple and undirected. We follow
West [14] for graph theory terminology and notation. The girth of a graph G is the
length of a shortest cycle in G. A k-colouring of a graph G is a function f from
the vertex set of G to a set of k colours, say $\{0, 1, \ldots, k-1\}$, such that f maps
every pair of adjacent vertices to different colours. A k-colouring f of G is a k-star
colouring if G does not contain a 4-vertex path bicoloured by f as a subgraph.
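
The definition of a star colouring translates directly into a small checker. This is a minimal sketch, assuming the graph is a simple undirected graph stored as an adjacency dictionary and the colouring is a dictionary from vertices to colours; the function names are hypothetical.

```python
def is_proper_colouring(adj, colour):
    """Adjacent vertices must receive different colours."""
    return all(colour[u] != colour[v] for u in adj for v in adj[u])

def has_bicoloured_p4(adj, colour):
    """Search for a path a-b-c-d on four distinct vertices that uses only two colours."""
    for b in adj:
        for c in adj[b]:
            for a in adj[b]:
                if a == c:
                    continue
                for d in adj[c]:
                    if d in (a, b):
                        continue
                    # Under a proper colouring, two colours on a-b-c-d means
                    # colour(a) == colour(c) and colour(b) == colour(d).
                    if colour[a] == colour[c] and colour[b] == colour[d]:
                        return True
    return False

def is_star_colouring(adj, colour):
    """A star colouring is a proper colouring with no bicoloured 4-vertex path."""
    return is_proper_colouring(adj, colour) and not has_bicoloured_p4(adj, colour)
```

For a path on four vertices, the alternating colouring 0, 1, 0, 1 is proper but not a star colouring, whereas 0, 1, 0, 2 is a star colouring.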

For every positive integer k, the decision problem k-COLOURABILITY takes
a graph G as input and asks whether G is k-colourable. The problem k-STAR
COLOURABILITY is defined likewise. To denote the restriction of a decision problem,
we write the conditions in parentheses. For instance, 4-STAR
COLOURABILITY($\Delta = 4$, girth = 5) denotes the problem 4-STAR COLOURABILITY restricted
to the class of graphs G with maximum degree $\Delta(G) = 4$ and girth(G) = 5.
For every construction in this paper, the output graph is made up of gadgets.
For every gadget, only some of the vertices in it are allowed to have edges to
vertices outside the gadget; we call these vertices terminals. In diagrams, we
draw a circle around each terminal. A decision problem is said to have a hardness
transition with respect to a parameter d at a point d = x if either (i) the
problem is in P for d = x − 1 and it is NP-complete for d = x, or (ii) the problem
is NP-complete for d = x − 1 and it is in P for d = x.


3 Bounded Degree Graphs 


In this section, we prove that 4-STAR COLOURABILITY and 5-STAR COLOURABILITY
are NP-complete for graphs of maximum degree four. To this end, we
employ two similar constructions named Construction 1 and Construction 2.
Detailed proofs are omitted from Sect. 3.1 due to space constraints.


3.1 4-Star Colouring 


We use the Petersen graph minus one vertex as the gadget component to build
gadgets in Construction 1.


Construction 1. 

Input: A 4-regular graph G. 

Output: A graph G’ of maximum degree four and girth five. 

Guarantee: G is 3-colourable if and only if G’ is 4-star colourable. 

Steps: 

Let $v_1, v_2, \ldots, v_n$ be the vertices in G. First, replace each vertex of G by a vertex
gadget as shown in Fig. 1. The vertex gadget for $v_i$ has five terminals, and the
terminals $v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}$ accommodate the four edges incident on $v_i$ in G in a
one-to-one fashion (order does not matter). So, corresponding to each edge $v_i v_j$
in G, there is an edge $v_{i,k} v_{j,\ell}$ in G′ for some $k, \ell \in \{1, 2, 3, 4\}$. Finally, introduce
the chain gadget displayed in Fig. 2, and join $v_{i,0}$ to $v_i^*$ for i = 1, 2, ..., n.


Proof of Guarantee (Overview). The following claims are pivotal to the proof. 


Claim 1: Every 4-star colouring of the gadget component (i.e., Petersen graph
minus one vertex) must assign the same colour to all three degree-2 vertices of
the gadget component.


Fig. 1. Replacement of vertex by vertex gadget.


Fig. 2. Chain gadget in Construction 1.


Claim 2: For every 4-star colouring of the vertex gadget (resp. chain gadget), 
all terminals of the gadget get the same colour. 


Claim 3: For every $v_i \in V(G)$ and every colour $c \ne 0$, the vertex gadget for
$v_i$ admits a 4-star colouring such that all terminals of the gadget have colour c,
no neighbour of $v_{i,0}$ is coloured 0, and neighbours of $v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}$ are all
coloured 0.


Claim 4: The chain gadget admits a 4-star colouring such that all terminals of 
the gadget get colour 0. 


The proofs of Claims 1, 3 and 4 are omitted. Claim 2 follows from Claim 1.
Suppose that G admits a 3-colouring $f : V(G) \to \{1, 2, 3\}$. The following
steps give a 4-star colouring of G′: (i) for each $v \in V(G)$, choose c = f(v)
and colour the vertex gadget for v by the colouring guaranteed in Claim 3, and
(ii) colour the chain gadget by the colouring guaranteed in Claim 4.
Conversely, suppose that G′ admits a 4-star colouring f′. By Claim 2, all
terminals of a vertex gadget (resp. chain gadget) get the same colour under f′.
Without loss of generality, assume that $f'(v_i^*) = 0$ for i = 1, 2, ..., n. Since $v_{i,0} v_i^*$
is an edge for $1 \le i \le n$, the chain gadget forbids colour 0 at vertices $v_{i,j}$ for
$1 \le i \le n$ and $1 \le j \le 4$. Therefore, the function $f : V(G) \to \{1, 2, 3\}$ defined as
$f(v_i) = f'(v_{i,0})$ for $1 \le i \le n$ is a 3-colouring of G.
Construction 1 establishes a reduction from 3-COLOURABILITY(4-regular) to
4-STAR COLOURABILITY($\Delta = 4$, girth = 5). Note that Construction 1 requires
only time polynomial in m + n because |E(G′)| = 61n + m and |V(G′)| = 41n + 1
(where m = |E(G)| and n = |V(G)|). Thus, we have the following theorem.


Theorem 1. 4-STAR COLOURABILITY is NP-complete for graphs of maximum 
degree four and girth five. 


82 M. A. Shalu and C. Antony 


3.2 5-Star Colouring 


We show that 5-STAR COLOURABILITY is NP-complete for graphs of maximum
degree four. Construction 2 below is employed to establish a reduction
from 3-COLOURABILITY(4-regular) to 5-STAR COLOURABILITY(triangle-free,
4-regular). Construction 2 is similar to Construction 1, albeit a bit more
complicated. For instance, we will need two chain gadgets this time because two
colours should be forbidden. The gadgets used in the construction are made of
two gadgets called the 2-in-2-out gadget and the not-equal gadget. These are in turn
made of one fixed graph, namely the Grötzsch graph minus one vertex; we call it the
gadget component (in Construction 2) for obvious reasons. The gadget component
is displayed in Fig. 3a. The following lemma explains why it is interesting
for 5-star colouring (the proof is omitted).


Lemma 1. Under every 5-star colouring of the gadget component, the degree-2
vertices of the graph should get pairwise distinct colours. Moreover, every 5-star
colouring of the gadget component must be of the form displayed in Fig. 3b or
Fig. 3c up to colour swaps.


Fig. 3. (a) Gadget component, (b, c) General form of 5-star colouring of it.


The 2-in-2-out gadget is displayed in Fig. 4. Observe that two copies of the 
gadget component are part of this gadget. The following lemma shows why 5-star 
colouring of this gadget is interesting (the proof is omitted). 


Lemma 2. For every 5-star colouring f of the 2-in-2-out gadget, there exist two
distinct colours $c_1$ and $c_2$ such that $f(y_1) = f(y_2) = f(z_1^*) = f(z_2^*) = c_1$ and
$f(y_1^*) = f(y_2^*) = f(z_1) = f(z_2) = c_2$. Moreover, every 3-vertex path containing
one of the pendant edges of the gadget is tricoloured by f.


The not-equal gadget is the graph displayed in Fig. 5. The not-equal gadget
is made from one 2-in-2-out gadget by identifying vertex $y_1^*$ of the 2-in-2-out
gadget with vertex $y_2^*$ and identifying vertex $z_1^*$ with vertex $z_2^*$. Hence, the next
lemma follows from Lemma 2 (note that $c_1 \ne c_2$ in Lemma 2).


Fig. 4. (a) The 2-in-2-out gadget, (b) its symbolic representation, and (c) a 5-star
colouring of the gadget.


Lemma 3. The terminals of the not-equal gadget should get different colours 
under each 5-star colouring f. Moreover, every 3-vertex path within the gadget 
with a terminal as one endpoint is tricoloured by f. 


We are now ready to present the construction. 


Construction 2. 

Input: A 4-regular graph G. 

Output: A triangle-free graph G’ of maximum degree four. 
Guarantee: G is 3-colourable if and only if G’ is 5-star colourable. 
Steps: 


Let $v_1, v_2, \ldots, v_n$ be the vertices in G. First, replace each vertex $v_i$ of G by a vertex
gadget as shown in Fig. 6. The vertex gadget for $v_i$ has six terminals, namely
$v_{i,0}, v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}$ and $v_{i,5}$. The terminals $v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}$ accommodate the
edges incident on $v_i$ in G. The replacement of vertices by vertex gadgets converts
each edge $v_i v_j$ of G to an edge between terminals $v_{i,k}$ and $v_{j,\ell}$ for some
$k, \ell \in \{1, 2, 3, 4\}$.

Next, replace each edge $v_{i,k} v_{j,\ell}$ between terminals by a not-equal gadget
between $v_{i,k}$ and $v_{j,\ell}$ (that is, introduce a not-equal gadget, identify one terminal
of the gadget with vertex $v_{i,k}$ and identify the other terminal with the vertex
$v_{j,\ell}$). Next, introduce two chain gadgets. The chain gadget is displayed in Fig. 7.


Fig. 5. (a) A not-equal gadget between terminals y and z (it is made of one 2-in-2-out
gadget), and (b) its symbolic representation.

Next, add a not-equal gadget between v;,9 and vj, for 1 <7 <n. Similarly, 
introduce a not-equal gadget between v;,5 and vj, for 1 <i <n. Finally, add a 
not-equal gadget between x11 and 219. 


Proof of Guarantee. For convenience, let us call the edges $y_1 y_1^*$, $y_2 y_2^*$ of a 2-in-2-out
gadget (see Fig. 4) the in-edges of the 2-in-2-out gadget, the edges $z_1 z_1^*$, $z_2 z_2^*$ the
out-edges of the 2-in-2-out gadget, the vertices $y_1^*, y_2^*$ the in-vertices of the 2-in-2-out
gadget, and the vertices $z_1^*, z_2^*$ the out-vertices of the 2-in-2-out gadget. The next
claim follows from Lemma 2.


Claim 1: If an in-edge of a 2-in-2-out gadget is an out-edge of another 2-in-2- 
out gadget, the colour of the out-vertices of both gadgets must be the same. 


Next, we point out a property of the vertex gadget and the chain gadget. 


Claim 2: All terminals of a vertex gadget (resp. chain gadget) should get the 
same colour under a 5-star colouring. 


By Claim 1, if an in-edge of a 2-in-2-out gadget is an out-edge of another 
2-in-2-out gadget, the colour of out-vertices of both gadgets must be the same. 
Repeated application of this idea proves Claim 2. 

Fig. 6. Replacement of vertex by vertex gadget.


We are now ready to prove the guarantee. Suppose that G admits a 3-colouring
$f : V(G) \to \{2,3,4\}$. A 5-colouring $f' : V(G') \to \{0,1,2,3,4\}$ of G' is
constructed as follows. First, assign $f'(v_{i,j}) = f(v_i)$ for $1 \le i \le n$ and $0 \le j \le 5$.
Extend this into a 5-star colouring of the vertex gadget by using the scheme in
Fig. 4c on each 2-in-2-out gadget within the vertex gadget (use the scheme in
Fig. 4c if $f'(v_{i,j}) = 4$; suitably swap colours in other cases). To colour the first
chain gadget, colour each 2-in-2-out gadget within this chain gadget using the
scheme obtained from Fig. 4c by swapping colour 4 with colour 0. Similarly, for
the second chain gadget, colour each 2-in-2-out gadget within the chain gadget
using the scheme obtained from Fig. 4c by swapping colour 4 with colour 1. To
complete the colouring, it suffices to extend the partial colouring to the not-equal
gadgets. For each not-equal gadget between two terminals, say terminal y and
terminal z, colour the 2-in-2-out gadget within the not-equal gadget using the
scheme obtained from Fig. 4c by swapping colour 3 with colour $f'(y)$ and
swapping colour 4 with colour $f'(z)$.

By Lemma 2 and Lemma 3 (see the second statements in both lemmas), 
every 3-vertex path in any gadget in G’ containing a terminal of the gadget as 
an endpoint is tricoloured by f’. In addition, the construction of the graph G’ 
is merely glueing together terminals of different gadgets. Therefore, there is no 
$P_4$ in G' bicoloured by f'; that is, f' is a 5-star colouring of G'.
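
The condition used above (a proper colouring with no bicoloured path on four vertices) can be checked mechanically on small graphs. The following is a minimal sketch, not part of the paper: the function name `is_star_colouring` and the adjacency-dictionary input format are assumptions chosen for illustration.

```python
def is_star_colouring(adj, colour):
    """Return True if `colour` is a star colouring of the graph `adj`.

    adj    : dict mapping each vertex to the set of its neighbours
             (hypothetical input format, simple undirected graph)
    colour : dict mapping each vertex to its colour
    """
    # The colouring must be proper: adjacent vertices get distinct colours.
    for u in adj:
        for v in adj[u]:
            if colour[u] == colour[v]:
                return False
    # No path a-b-c-d on four distinct vertices may use only two colours.
    # In a proper colouring such a path is bicoloured exactly when
    # colour[a] == colour[c] and colour[b] == colour[d].
    for b in adj:
        for c in adj[b]:
            for a in adj[b]:
                if a == c or colour[a] != colour[c]:
                    continue
                for d in adj[c]:
                    if d != a and d != b and colour[d] == colour[b]:
                        return False
    return True
```

For instance, on the path 0-1-2-3 the colouring 0,1,0,1 is rejected, while 0,1,0,2 is accepted.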

Conversely, suppose that G' admits a 5-star colouring $f' : V(G') \to \{0,1,2,3,4\}$.
By Claim 2, all terminals of a vertex/chain gadget should have the same colour
under f'. As there is a not-equal gadget between $x_{1,1}$ and $x_{1,2}$, we have
$f'(x_{1,1}) \ne f'(x_{1,2})$ (by Lemma 3). Without loss of generality, assume that
$f'(x_{1,1}) = 0$ and $f'(x_{1,2}) = 1$. By Claim 2, all terminals of the first chain gadget
have colour 0; that is, $f'(x_{1,1}) = f'(x_{2,1}) = f'(v^*_{i,1}) = 0$ for $1 \le i \le n$.
Similarly, all terminals of the second chain gadget have colour 1; that is,
$f'(x_{1,2}) = f'(x_{2,2}) = f'(v^*_{i,2}) = 1$ for $1 \le i \le n$. By Claim 2, all terminals of
the vertex gadget for $v_1$ have the same colour under f', say colour c. Since
there is a not-equal gadget between $v_{1,0}$ and $v^*_{1,1}$, we have
$c = f'(v_{1,0}) \ne f'(v^*_{1,1}) = 0$. Since there is a not-equal gadget between $v_{1,5}$ and
$v^*_{1,2}$, we have $c = f'(v_{1,5}) \ne f'(v^*_{1,2}) = 1$. So, $c \in \{2,3,4\}$. Hence, for
$0 \le j \le 5$, $f'(v_{1,j}) \in \{2,3,4\}$. Similarly, for $1 \le i \le n$ and $0 \le j \le 5$,
$f'(v_{i,j}) \in \{2,3,4\}$. Moreover, whenever $v_i v_j$ is an edge in G, there is a not-equal
gadget between terminals $v_{i,k}$ and $v_{j,\ell}$ in G' for some $k, \ell \in \{1,2,3,4\}$ and hence
$f'(v_{i,k}) \ne f'(v_{j,\ell})$. Therefore, the function $f : V(G) \to \{2,3,4\}$ defined as
$f(v_i) = f'(v_{i,0})$ is indeed a 3-colouring of G. This proves the converse part and
thus the guarantee.


Fig. 7. t-th chain gadget in Construction 2 if n is even, where t = 1 or 2. If n is odd, the
t-th chain gadget is the same except that it has only n + 1 terminals
$v^*_{1,t}, v^*_{2,t}, \dots, v^*_{n,t}$ and $x_{1,t}$. A chain gadget is similar to a vertex gadget; the
only difference is that it has more levels and terminals.


Theorem 2. 5-STAR COLOURABILITY is NP-complete for triangle-free graphs 
of maximum degree four. 


Proof. We employ Construction 2 to establish a reduction from
3-COLOURABILITY(4-regular) to 5-STAR COLOURABILITY(triangle-free, $\Delta = 4$).
Let G be an instance of 3-COLOURABILITY(4-regular). From G, construct an
instance G' of 5-STAR COLOURABILITY(triangle-free, $\Delta = 4$) by Construction 2.

Let m = |E(G)| and n = |V(G)|; note that m = 2n because G is 4-regular. In G',
there are at most
$6n + m + 2(1 + 2 + \dots + \lceil (n+1)/2 \rceil + n) + 1 \le \frac{1}{4}(n^2 + 46n + 12)$
2-in-2-out gadgets and in addition at most 16n + 8 vertices and 32n + 12 edges.
So, G' can be constructed in time polynomial in n. By the guarantee in
Construction 2, G is 3-colourable if and only if G' is 5-star colourable.
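
The counting bound in the proof can be sanity-checked numerically. The sketch below is an illustration only; it assumes the count is grouped as $6n + m + 2(1 + 2 + \dots + \lceil (n+1)/2 \rceil + n) + 1$ with $m = 2n$, and the helper names are hypothetical.

```python
def gadget_count(n):
    """Upper bound on the number of 2-in-2-out gadgets, as read from the proof.

    Assumes a 4-regular input graph on n vertices, so m = 2n edges.
    """
    m = 2 * n
    levels = (n + 2) // 2                 # equals ceil((n + 1) / 2)
    triangular = levels * (levels + 1) // 2   # 1 + 2 + ... + levels
    return 6 * n + m + 2 * (triangular + n) + 1


def bound_holds(n):
    """Check gadget_count(n) <= (n^2 + 46n + 12) / 4 in integer arithmetic."""
    return 4 * gadget_count(n) <= n * n + 46 * n + 12
```

Under this reading the bound is attained with equality for even n and is strict for odd n, which is consistent with the factor 1/4 in the stated inequality.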
4 Regular Graphs


We prove that for all $k \ge 3$ and $d < k$, the complexity of k-STAR COLOURABILITY
is the same for graphs of maximum degree d and d-regular graphs. That is, for
all $k \ge 3$ and $d < k$, k-STAR COLOURABILITY restricted to graphs of maximum
degree d is in P (resp. NP-complete) if and only if k-STAR COLOURABILITY
restricted to d-regular graphs is in P (resp. NP-complete). First, we show that
for all $k \ge 3$, the complexity of k-STAR COLOURABILITY is the same for graphs
of maximum degree $k - 1$ and $(k - 1)$-regular graphs.


Construction 3. 

Parameter: An integer $k \ge 3$.

Input: A graph G of maximum degree $k - 1$.

Output: A $(k - 1)$-regular graph G'.

Guarantee 1: G is k-star colourable if and only if G' is k-star colourable.
Guarantee 2: If G is triangle-free (resp. bipartite), then G' is triangle-free (resp.
bipartite).

Steps:

Introduce two copies of G. For each vertex v of G, introduce $(k - 1) - \deg_G(v)$
filler gadgets (see Fig. 8) between the two copies of v.


Fig. 8. A filler gadget for v € V(G). 


It is easy to show that each k-star colouring of G can be extended into a 
k-star colouring of G’ (detailed proofs of the guarantees in Construction 3 are 
omitted). Thanks to Construction 3, we have the following theorem. 


Theorem 3. For all $k \ge 3$, the complexity of k-STAR COLOURABILITY is the
same for graphs of maximum degree $k - 1$ and $(k - 1)$-regular graphs. In addition,
for $k \ge 3$, the complexity of k-STAR COLOURABILITY is the same for
triangle-free (resp. bipartite) graphs of maximum degree $k - 1$ and triangle-free
(resp. bipartite) $(k - 1)$-regular graphs.


By Theorem 3, the complexity of 5-STAR COLOURABILITY is the same for 
triangle-free graphs of maximum degree four and triangle-free 4-regular graphs. 
Thus, by Theorem 2, we have the following. 


Theorem 4. 5-STAR COLOURABILITY is NP-complete for triangle-free 4- 
regular graphs. 
Construction 4.


Parameters: Integers $k \ge 3$ and $d < k$.

Input: A graph G of maximum degree d.

Output: A d-regular graph G*.

Guarantee: G is k-star colourable if and only if G* is k-star colourable.

Steps:

Introduce two copies of G. For each vertex v of G, introduce $d - \deg_G(v)$ filler
gadgets (see Fig. 9) between the two copies of v.


Fig. 9. A filler gadget for v € V(G). 


To prove the guarantee, observe that G is a subgraph of G* and G* is a sub- 
graph of G’ (the output graph in Construction 3). Thus, we have the following. 


Theorem 5. For all $k \ge 3$ and $d < k$, the complexity of k-STAR COLOURABILITY
is the same for (triangle-free/bipartite) graphs of maximum degree d and
(triangle-free/bipartite) d-regular graphs.


5 Conclusion 


A decision problem has a hardness transition with respect to a parameter d at a
point d = x if either (i) the problem is in P for d = x - 1 and it is NP-complete for
d = x, or (ii) the problem is NP-complete for d = x - 1 and it is in P for d = x (see
[8]). For all $k \ge 3$, the problem k-COLOURABILITY restricted to graphs of
maximum degree d has exactly one point of hardness transition with respect to d (to
produce a reduction, add a disjoint copy of $K_{1,d+1}$). By the same reasoning, star
colouring displays similar behaviour when restricted to bounded degree graphs.
That is, the problem k-STAR COLOURABILITY restricted to graphs of maximum
degree d has exactly one point of hardness transition (w.r.t. d), say $d = T^{(k)}$. For
sufficiently large k, $d = k - 1 + \lceil \sqrt{k} \rceil$ is the unique point of hardness transition
(w.r.t. d) for the problem k-COLOURABILITY restricted to graphs of maximum
degree d (see [4] and [11, Theorem 43]). In contrast, for the problem k-STAR
COLOURABILITY in graphs of maximum degree d, the unique point of hardness
transition (namely $T^{(k)}$) is unknown even for sufficiently large k. Since 4-STAR
COLOURABILITY and 5-STAR COLOURABILITY are NP-complete for graphs of
maximum degree four (see Theorems 1 and 2), $T^{(4)} \le 4$ and $T^{(5)} \le 4$.

In a forthcoming (unpublished) paper, we prove that for all $k \ge 4$, $T^{(k)} \le k$.
We suspect that for all $k \ge 5$, $T^{(k)} \le k - 1$.

When it comes to the class of regular graphs, colouring shows the same
behaviour as in the class of bounded degree graphs because k-COLOURABILITY
is NP-complete for d-regular graphs if and only if k-COLOURABILITY is
NP-complete for graphs of maximum degree d. Such a property does not hold in
general for star colouring; for example, 3-STAR COLOURABILITY is NP-complete
for graphs of maximum degree four [9] whereas it is in P for 4-regular graphs [15].
We show that for all $k \ge 3$ and $d < k$, the complexity of k-STAR COLOURABILITY
is the same for graphs of maximum degree d and d-regular graphs. The following
observation is a consequence of this (the proof is omitted).


Observation 1. If $k \ge 3$ and $T^{(k)} \le k - 1$, then between d = 1 and d = k - 1,
the problem k-STAR COLOURABILITY in d-regular graphs has exactly one point
of hardness transition (w.r.t. d), namely $d = T^{(k)}$.


In a forthcoming (unpublished) paper, we show that for all k > 4, the problem 
k-STAR COLOURABILITY is NP-complete for graphs of maximum degree k and 
polynomial-time solvable for d-regular graphs for each d > 2k—4. In particular, 5- 
STAR COLOURABILITY is polynomial-time solvable for d-regular graphs for each 
d > 6. Since 5-STAR COLOURABILITY in d-regular graphs is NP-complete for d = 
4 (see Theorem 4) and in P for d < 2 as well as d > 6, 5-STAR COLOURABILITY 
in d-regular graphs has at least two points of hardness transition. In general, 
for k > 5, k-STAR COLOURABILITY in d-regular graphs has either zero or at 
least two points of hardness transition (the latter is more likely). We conjecture 
that there are exactly two points of hardness transition and the second point of 
hardness transition is close to 2k — 4. 


Conjecture 1. For $k \ge 5$, the problem k-STAR COLOURABILITY in d-regular
graphs has exactly two points of hardness transition, $d = T^{(k)}$ and $d = \widetilde{T}^{(k)}$, and
the second point of hardness transition $\widetilde{T}^{(k)}$ satisfies $|\widetilde{T}^{(k)} - (2k - 4)| \le 1$.
Abstract. A natural constraint in real-world applications is to avoid
conflicting elements in the solution of problems. Given an undirected
graph G = (V, E) where each edge e ∈ E has a positive integer weight
w(e), and a conflict graph $\hat{G} = (\hat{V}, \hat{E})$ such that $\hat{V} \subseteq E$ and each edge
$\hat{e} = (e_1, e_2) \in \hat{E}$ represents a conflict between two edges $e_1, e_2 \in E$, in
the MINIMUM CONFLICT-FREE SPANNING TREE (MCFST) problem we
are asked to find (if any) a spanning tree avoiding pairs of conflicting
edges (conflict-free) with minimum cost, i.e., a minimum solution among
spanning trees T such that E(T) is an independent set of $\hat{G}$. A spanning
tree T of G is a feasible solution for an instance $I = (G, \hat{G})$ of MCFST if
E(T) is an independent set of $\hat{G}$. In contrast to the polynomial-time
solvability of MINIMUM SPANNING TREE, to determine whether an instance
$I = (G, \hat{G})$ of MCFST admits a feasible solution is NP-complete. In this
paper, we present a multivariate complexity analysis of MCFST by
considering particular classes of graphs G and $\hat{G}$. In particular, we show that
the problem of determining whether an instance $I = (G, \hat{G})$ of MCFST
has a feasible solution is NP-complete even if G is a bipartite planar
subcubic graph, and $\hat{G}$ is a disjoint union of paths of size three ($P_3$).
Moreover, we show that if G is a complete graph and $\hat{G}$ is a disjoint
union of stars, then a feasible solution for $I = (G, \hat{G})$ can be found
in polynomial time. In addition, we present (in)approximability results
for MCFST on complete graphs G, and an FPT algorithm parameterized
by the distance to $\mathcal{F}$ of the conflict graph $\hat{G}$, where $\mathcal{F}$ is a hereditary
graph class such that MCFST on conflict graphs $\hat{G} \in \mathcal{F}$ can be solved
in polynomial time.


Keywords: Conflict-free - Spanning tree - Approximation - FPT 


1 Introduction 


Given an edge-weighted graph G, the MINIMUM SPANNING TREE (MST) problem
consists of finding a spanning tree of G having minimum cost, where the
cost is the sum of the weights of its edges. MST is a very important problem
with applications in several areas. Moreover, the MST problem can be solved
in polynomial time by either Kruskal's or Prim's algorithm [9,13]. A survey on
MST can be found in [8].
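
As a point of reference for the unconstrained problem, Kruskal's algorithm can be sketched in a few lines. This is a standard textbook formulation, not code from the paper; the function name `kruskal` and the `(weight, u, v)` edge format are assumptions made for the example.

```python
def kruskal(n, edges):
    """Minimum spanning tree by Kruskal's algorithm.

    n     : number of vertices, labelled 0..n-1
    edges : list of (weight, u, v) triples (hypothetical input format)
    Returns (cost, tree_edges), or None if the graph is disconnected.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cost, tree = 0, []
    for w, u, v in sorted(edges):      # scan edges by non-decreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # the edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
            cost += w
    return (cost, tree) if len(tree) == n - 1 else None
```

The greedy choice is exactly what conflicts break: once some pairs of edges may not coexist, taking the cheapest non-cycle-forming edge can block every feasible completion.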

Conflict-free variants of classical decision and optimization problems have
aroused considerable interest and have been studied in the recent literature. A
classical computational problem X can be turned into a conflict-free version of
X by coupling a conflict graph $\hat{G}$ together with the instances $I_X$ of X. In such
a conflict graph, the vertices represent elements of $I_X$ and its edges represent
pairs of elements that are prohibited from being together in the same solution.
A solution for an instance $(I_X, \hat{G})$ of the conflict-free version of X represents,
simultaneously, an independent set in $\hat{G}$ and a solution for the instance $I_X$ of X.

Several optimization problems have already been studied from the viewpoint
of conflict-free versions, such as BIN PACKING [1,7], KNAPSACK [11],
MAXIMUM MATCHING and SHORTEST PATH [3]. In this work, we deal with the
conflict-free versions of the MINIMUM SPANNING TREE problem. The
CONFLICT-FREE SPANNING TREE (CFST) problem was introduced in [2] and consists in
determining if a graph G has a conflict-free spanning tree according to a
conflict graph $\hat{G}$ representing conflicts between edges. In addition, the MINIMUM
CONFLICT-FREE SPANNING TREE (MCFST) problem consists in finding a
minimum conflict-free spanning tree (if any) of $(G, \hat{G})$ where G is an edge-weighted
simple graph. Below we formally present both problems.


CONFLICT-FREE SPANNING TREE (CFST) 

Input: A simple undirected graph G = (V, E), and a conflict graph $\hat{G} = (\hat{V}, \hat{E})$,
where $\hat{V} \subseteq E(G)$.

Question: Is there a spanning tree T of G such that E(T) induces an independent
set in $\hat{G}$?


MINIMUM CONFLICT-FREE SPANNING TREE (MCFST) 

Input: A simple undirected graph G = (V, E) where each edge e ∈ E has a
positive integer weight w(e), and a conflict graph $\hat{G} = (\hat{V}, \hat{E})$, where $\hat{V} \subseteq E$.
Goal: Find (if any) a conflict-free spanning tree of $(G, \hat{G})$ which minimizes the
sum of its edge weights.


Note that given a simple undirected edge-weighted graph G = (V, E), and a
conflict graph $\hat{G} = (\hat{V}, \hat{E})$ of G, the CONFLICT-FREE SPANNING TREE problem
consists in determining whether $(G, \hat{G})$ admits a feasible solution, while MCFST
is the natural conflict-free minimization version of MINIMUM SPANNING TREE.
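
To make the two definitions concrete, the following brute-force sketch enumerates all sets of n - 1 edges and keeps the cheapest conflict-free spanning tree. It is exponential and meant only for tiny instances; the function names and the input encoding (a weight dictionary and a set of frozensets of conflicting edge pairs) are assumptions for illustration, not the paper's method.

```python
from itertools import combinations


def _connects_all(n, edges):
    """True if the given edges connect all vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def min_conflict_free_spanning_tree(n, weights, conflicts):
    """Exhaustive search for MCFST on a tiny instance.

    weights   : dict {(u, v): positive integer weight}
    conflicts : set of frozenset({e1, e2}) for conflicting edges e1, e2
    Returns (cost, edges) of a cheapest conflict-free spanning tree,
    or None when the CFST instance is infeasible.
    """
    best = None
    for subset in combinations(list(weights), n - 1):
        # E(T) must be an independent set of the conflict graph.
        if any(frozenset(pair) in conflicts for pair in combinations(subset, 2)):
            continue
        # n - 1 edges that connect all vertices form a spanning tree.
        if not _connects_all(n, subset):
            continue
        cost = sum(weights[e] for e in subset)
        if best is None or cost < best[0]:
            best = (cost, subset)
    return best
```

A return value of `None` answers the CFST question negatively, while a non-`None` value also solves MCFST for that instance.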

In [3], the authors proved that CFST remains NP-hard even when the conflict
graph is a disjoint union of paths of size three ($P_3$'s, i.e., paths of length 2). Also,
they show that if the conflict graph is a disjoint union of paths of size 2 ($P_2$'s,
i.e., paths of length 1), then MCFST becomes polynomial-time solvable. In addition,
when the underlying graph G is a cactus it holds that CFST (the feasibility problem)
is polynomial-time solvable, but MCFST (the optimization version) is still
NP-hard [16]. Another interesting result obtained by [16] is that if the conflict graph
is a cluster graph (disjoint union of cliques), then MCFST can be solved in
polynomial time. In addition, a unifying model for locally constrained spanning
tree problems was presented in [5].

In this paper, we present a multivariate complexity analysis of MCFST by
considering particular classes of graphs G and $\hat{G}$. In particular, we show that
the problem of determining whether an instance $I = (G, \hat{G})$ admits a feasible
solution is NP-complete even when G is a bipartite planar subcubic graph, and
$\hat{G}$ is a disjoint union of paths of size three. Moreover, we show that if G is a
complete graph and $\hat{G}$ is a disjoint union of stars, then a feasible solution for
$I = (G, \hat{G})$ can be found in polynomial time, while the problem of finding an optimum
solution is still NP-hard. Finally, concerning MCFST, (in)approximability results
on complete graphs G, and an FPT algorithm parameterized by the distance to
$\mathcal{F}$ of $\hat{G}$ are also presented, where $\mathcal{F}$ is any hereditary graph class such that
MCFST on conflict graphs $\hat{G} \in \mathcal{F}$ can be solved in polynomial time.


2 Preliminaries 


Graphs 


We consider a simple undirected graph G = (V, E) (or just a graph, for short) as
a pair of sets such that V(G) is the set of vertices and E(G) is the set of edges,
where each edge connects a distinct pair of vertices. When two vertices share an
edge they are "adjacent" (or "neighbors"). The number of neighbors of a vertex
v is called the degree of v. An induced subgraph of a graph G is another graph,
formed from a subset X of V(G) and all of the edges of G connecting pairs of
vertices in X. For $X \subseteq V(G)$, we denote by G[X] the subgraph of G induced
by X. A set $X \subseteq V(G)$ is an independent set of G if G[X] does not contain any
edge. A complete graph, $K_n$, is a simple graph with n vertices and $\frac{n^2 - n}{2}$ edges.
A clique of a graph G is a subgraph of G that is complete. A path is a sequence
of vertices that does not repeat any vertex, and each pair of consecutive vertices
in the sequence is adjacent in the graph. A cycle is a path where the first and
last vertex of the sequence are also adjacent. Vertices $v_1$ and $v_2$ are connected if
there is a path beginning with $v_1$ and ending with $v_2$. A graph G is connected if
every pair of vertices in G is connected. A graph that is not connected is called
disconnected. A tree is a connected graph having no cycle. A star is a tree having
a vertex v called center (or root) such that all remaining vertices are adjacent to
v. A connected component of a graph G is a maximal connected subgraph of G. If
each connected component of G is a clique then G is a cluster graph. We call
G[X] a block of G if G[X] is connected, there is no v ∈ X such that $G[X \setminus \{v\}]$
is disconnected, and X is maximal with respect to such properties. A graph G
is subcubic if each vertex of G has degree at most three. An induced path of a
graph G is a path that is an induced subgraph of G, i.e., nonadjacent vertices
in the sequence are not connected by an edge in G. An induced path with n
vertices is a $P_n$. We denote by $E_v$ the set of edges incident to v. A spanning tree
T of a graph G is a subgraph of G that is a tree and contains all vertices of G.

Given a graph G = (V, E) where each edge e ∈ E has a positive integer weight
w(e), the MINIMUM SPANNING TREE problem consists in finding a spanning tree
with the minimum cost, where its cost is the sum of its edge weights. The conflict
graph of an instance G of MINIMUM SPANNING TREE is denoted by $\hat{G}$ and its
vertex set is formed only by edges of G.


SAT and Its Variants 


Given a CNF Boolean formula F, the SAT problem consists of determining if
there exists an assignment to the variables of F that satisfies the formula. 3SAT
is the particular case of SAT where each clause has at most three literals, and
its subcase where each variable occurs at most three times is denoted by 3SAT3.
3SAT as well as 3SAT3 are classical NP-complete problems extensively used in
NP-hardness proofs.

Let F be a CNF Boolean formula, and let $B_F$ be the bipartite incidence
graph where the vertices represent the variables and clauses of F, and the edges
represent the occurrence of variables in clauses. When the instances are restricted
to formulas having a planar embedding for their bipartite incidence graphs, these
particular cases are called PLANAR SAT, PLANAR 3SAT, and PLANAR 3SAT3
respectively, and all remain NP-complete.

It is useful to consider additional constraints concerning the bipartite
incidence graph of the formulas to prove the NP-hardness of problems in special
graph classes. In addition to the constraint of being planar, if we add a
Hamiltonian cycle that goes through all the variable nodes, and the incidence graph
is still planar, then we have a Var-Linked Planar instance. Besides, if all
variable nodes can be drawn as straight line segments on the x axis and all clauses
can be drawn as horizontal lines, connected to the variable nodes by at most 3
vertical segments (also known as a three-legged embedding [6,12]), then we have a
Rectilinear Var-Linked Planar 3SAT instance.

Tippenhauer [14] showed that VAR-LINKED PLANAR 3SAT is equivalent to
RECTILINEAR VAR-LINKED PLANAR 3SAT, and from a planar embedding of
the incidence graph $B_F$ a rectilinear var-linked planar embedding can be easily
drawn for $B_F$. Thus, whenever convenient, we can assume rectilinear embeddings
for instances of VAR-LINKED PLANAR 3SAT.

In [10], it was shown that VAR-LINKED PLANAR 3SAT is NP-complete even
when all variables occur exactly three times with two positive and one negative
occurrence, and no variable appears more than once in the same clause. This,
together with the equivalence provided by [14], implies that RECTILINEAR
VAR-LINKED PLANAR 3SAT3 is NP-complete.
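
The occurrence restrictions described above (at most three literals per clause, every variable occurring exactly three times with two positive and one negative occurrence, and no variable repeated inside a clause) are easy to test on a given formula. The sketch below checks only these combinatorial conditions and says nothing about planarity or the var-linked embedding; the DIMACS-style clause encoding and the function name are assumptions for illustration.

```python
from collections import Counter


def has_restricted_occurrences(clauses):
    """Check the occurrence pattern required of the 3SAT3 instances above.

    clauses : list of clauses, each a list of non-zero integers, where
              literal i stands for x_i and -i for its negation
              (DIMACS-style encoding, assumed for this example).
    """
    positive, negative = Counter(), Counter()
    for clause in clauses:
        if not 1 <= len(clause) <= 3:
            return False
        variables = [abs(lit) for lit in clause]
        if len(set(variables)) != len(variables):
            return False               # a variable repeats within the clause
        for lit in clause:
            (positive if lit > 0 else negative)[abs(lit)] += 1
    # Every variable: exactly two positive and one negative occurrence.
    return all(positive[v] == 2 and negative[v] == 1
               for v in set(positive) | set(negative))
```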


3 Computational Complexity 


In this section, we present some new theoretical results concerning CFST. 
NP-completeness


Using a reduction from RECTILINEAR VAR-LINKED PLANAR 3SAT3 we show
that CONFLICT-FREE SPANNING TREE (CFST) remains NP-complete even
when the input graph G is bipartite planar and subcubic, and the conflict graph
$\hat{G}$ is a disjoint union of induced paths of size three.


Theorem 1. CONFLICT-FREE SPANNING TREE is NP-complete even when G
is a bipartite planar subcubic graph, and the conflict graph $\hat{G}$ is a disjoint union
of $P_3$'s (induced paths with three vertices).


Proof. It is easy to see that CONFLICT-FREE SPANNING TREE is in NP. Therefore,
it is enough to show the NP-hardness. Let F be an instance of RECTILINEAR
VAR-LINKED PLANAR 3SAT3 where all variables occur exactly three
times with one negative and two positive occurrences. From F we construct an
instance $(G, \hat{G})$ of CONFLICT-FREE SPANNING TREE as follows.

Let $B_F$ be the bipartite incidence graph of F. Considering a rectilinear
var-linked planar embedding of $B_F$, we use such an embedding of $B_F$ as support to
construct G.


1. For each clause $C_j$, create a vertex $c_j$ in G and position it in the plane in the
position corresponding to $C_j$ in the embedding of $B_F$;

2. For each variable $x_i$, create three vertices $x_i$, $x_i'$ and $\bar{x}_i$ in G. To position these
three vertices, we consider the relative position between the edges with an
endpoint in $x_i \in B_F$. If, from left to right, the edge representing
the negative connection appears first, in the middle, or last, we position over
the X-axis the vertices in the sequence "$\bar{x}_i, x_i, x_i'$", "$x_i, \bar{x}_i, x_i'$", or
"$x_i, x_i', \bar{x}_i$", respectively. The relative position between the edges must always be
maintained by the vertices, and we also add an intermediate vertex between
any pair of consecutive vertices as in Fig. 1 (the intermediate vertices make
the resulting graph bipartite);


Fig. 1. Vertex gadget


3. If the edge $(C_j, x_i)$ is in $B_F$, we add in G the equivalent edge, respecting the
connection in F and the relative position in the rectilinear var-linked planar
embedding of $B_F$;

4. Create an induced path P in G connecting all $x_i$, $x_i'$, $\bar{x}_i$ and intermediate
vertices created over the X-axis by adding edges between the consecutive
vertices. Note that P is similar to the straight line crossing the variables in
the rectilinear var-linked planar embedding of $B_F$;

5. The conflict graph $\hat{G}$ is obtained by creating one vertex for each edge
representing an occurrence of a literal in a clause. Then, for each i from 1 to
n, we add a $P_3$ representing $(x_i, c_j)$ conflicting with $(\bar{x}_i, c_k)$, and $(x_i', c_l)$
conflicting with $(\bar{x}_i, c_k)$, where $C_k$ is the clause containing the literal $\bar{x}_i$, and $C_j$,
$C_l$ are the clauses containing the first and second occurrences of the literal
$x_i$, respectively.
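
Step 5 can be sketched in code: for every variable, the edge of the negative occurrence becomes the centre of a $P_3$ whose two ends are the edges of the positive occurrences. This is an illustrative sketch only; the clause encoding, the string labels such as `"~x1"` and the function name `conflict_paths` are assumptions, and the planar embedding of G (steps 1 to 4) is not modelled.

```python
def conflict_paths(clauses):
    """Build the P3 components of the conflict graph described in step 5.

    clauses : list of clauses as lists of non-zero integers (i for x_i,
              -i for its negation); every variable is assumed to occur
              exactly twice positively and once negatively.
    Returns one triple (end, centre, end) of conflict-graph vertices per
    variable; each vertex is a pair (literal vertex, clause vertex).
    """
    occurrences = {}
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            entry = occurrences.setdefault(abs(lit), {"pos": [], "neg": []})
            entry["pos" if lit > 0 else "neg"].append(j)

    paths = []
    for i in sorted(occurrences):
        (j, l), (k,) = occurrences[i]["pos"], occurrences[i]["neg"]
        first = (f"x{i}", f"c{j}")      # first positive occurrence
        centre = (f"~x{i}", f"c{k}")    # negative occurrence: centre of the P3
        second = (f"x{i}'", f"c{l}")    # second positive occurrence
        paths.append((first, centre, second))
    return paths
```

Because the negative occurrence conflicts with both positive ones, a conflict-free edge set can use the clause edge of $\bar{x}_i$ or the clause edges of $x_i$ and $x_i'$, but never a mix; this is what forces a consistent truth assignment.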


Figure 2 shows an instance of CFST obtained by the steps described above. 
The square and round shapes of the vertices illustrate the bipartition of V(G). 




Fig. 2. Instance of CONFLICT-FREE SPANNING TREE obtained from F' = (a1 + x2 + 
£3)(a1 + £3 + Ea) (Z2 + 03 + wa)(Z1 + 224+ wa). 


Since each clause has at most three variables and each literal occurs at most
twice, it is easy to see that the resulting graph G has maximum degree equal to
three. In order to observe that G is planar (see step 2), it is enough to note that
$B_F$ is planar, and each $x_i$ in $B_F$ has been replaced by a gadget that preserves
the planarity.

Now, it remains to show that F is satisfiable if and only if $(G, \hat{G})$ is a
"yes"-instance of CONFLICT-FREE SPANNING TREE. Let n be the number of variables
of F, and let m be the number of clauses of F.

Let A be a satisfying assignment for F. As G has 6n + m - 1 vertices, we
must select at least 6n + m - 2 edges forming a connected subgraph of G and
an independent set in $\hat{G}$. Note that it is sufficient to find a connected spanning
subgraph of G in which the edges induce an independent set in $\hat{G}$ (in this case,
the tree can be easily computed). Let X be the set of variables from F, and
consider that T initially contains all vertices of G and no edge. From A we
obtain the edge set to be added to T as follows.


- Add to T all edges joining two vertices over the X-axis.
- If $x_i \in X$ has true value in A, add to T the clause edges incident to $x_i$ and $x_i'$.
Otherwise, add to T the remaining edge of $\bar{x}_i$ (the one joining it to a clause vertex).

Since A is a satisfying assignment, every vertex $c_j$ has at least one edge added
to T. Thus, by construction, T is a conflict-free connected spanning subgraph of
G, which is enough for our proof.

Now, let T be a conflict-free spanning tree of G. Without loss of generality,
we can assume that T contains P. Therefore, we form a satisfying assignment
A for F as follows. If the vertex $\bar{x}_i$ in T has a neighbor in $\{c_1, \dots, c_m\}$, then
set $x_i$ to false; otherwise set $x_i$ to true. Since every vertex $c_j$ has some
neighbor in T, and edges representing occurrences of $x_i$ and $\bar{x}_i$ are conflicting
edges, it holds that every clause of F has at least one literal evaluated as true
by A, which is a consistent assignment due to the construction of $\hat{G}$.

Therefore, CONFLICT-FREE SPANNING TREE is NP-complete even on
instances $(G, \hat{G})$ where G is a bipartite planar subcubic graph, and $\hat{G}$ is a disjoint
union of $P_3$'s.


Next, we present some remarks regarding complete graphs. 


Theorem 2. Let $\mathcal{F}$ be a family of graphs. If CONFLICT-FREE SPANNING TREE
is NP-complete when restricted to instances with conflict graph in $\mathcal{F}$, then the
MINIMUM CONFLICT-FREE SPANNING TREE problem is NP-hard when G is a
complete graph and $\hat{G} \in \mathcal{F}$.


Proof. Let $I = (G, \hat{G})$ be an instance of the CONFLICT-FREE SPANNING TREE
problem, where |V(G)| = n and |E(G)| = m. From $I = (G, \hat{G})$ we create an
instance $I' = (G', \hat{G}')$ of MCFST such that I has a feasible solution if and only
if I' has a conflict-free spanning tree with cost n - 1. First, G' starts as a copy of G,
and we define a weight function w on its edges. For each e ∈ E(G'), we set w(e) = 1;
after that, for each edge f ∉ E(G) we add f to E(G') with w(f) = 2. Finally, $\hat{G}'$
is defined as a copy of $\hat{G}$. At this point, it is easy to see that I has a conflict-free
solution if and only if I' has an optimal solution of cost n - 1, that is, an
optimal solution of I' that does not contain any edge of cost 2. Thus, since $\hat{G}'$ is a
copy of $\hat{G}$, the claim holds.
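
The reduction in the proof is short enough to write down directly. The following sketch completes the input graph with weight-2 edges and copies the conflict graph unchanged; the function name and the encodings (vertices 0..n-1, edges as pairs, conflicts as a collection of edge pairs) are assumptions for illustration.

```python
from itertools import combinations


def to_complete_instance(n, edges, conflicts):
    """Turn a CFST instance into an MCFST instance on the complete graph K_n.

    n         : number of vertices, labelled 0..n-1
    edges     : iterable of edges (u, v) of the original graph G
    conflicts : iterable of conflicting edge pairs (the conflict graph)
    Returns (weights, conflicts'), where original edges get weight 1 and
    every added edge gets weight 2; the conflict graph is copied as is.
    """
    original = {tuple(sorted(e)) for e in edges}
    weights = {pair: (1 if pair in original else 2)
               for pair in combinations(range(n), 2)}
    return weights, set(conflicts)
```

Since added edges never appear in the conflict graph, a spanning tree of cost exactly n - 1 in the new instance uses only original edges and is conflict-free precisely when it is conflict-free in the original instance.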


Corollary 1. MINIMUM CONFLICT-FREE SPANNING TREE is strongly NP-hard 
on complete graphs G even when the conflict graph is a disjoint union of paths 
of size three. 


Proof. Let $I = (G, \hat{G})$ be an instance of CFST with $\hat{G}$ being a disjoint union of
$P_3$'s. By Theorem 1, we know that deciding if I admits a conflict-free spanning
tree is an NP-complete problem. Let $\mathcal{F}_3$ be the family of graphs formed by the
disjoint union of $P_3$'s. By Theorem 2, we can construct a complete graph G'
coupled with $\hat{G}$, such that I is a yes-instance of CFST if and only if the minimum
conflict-free spanning tree of $(G', \hat{G})$ has cost |V(G)| - 1. Since the instance weights
are constants, the strong NP-hardness holds.


Note that Corollary 1 concerns MCFST. At this point, the reader may
be asking about the problem of finding a feasible solution when G is a complete
graph and $\hat{G}$ is a disjoint union of paths of size three. Next, in contrast with
Corollary 1, we show that finding a feasible solution can be done in polynomial
time in this case.
A Polynomial-Time Solvable Case


Recall that a $P_3$ is a star with three vertices, and CFST is NP-complete even
when $\hat{G}$ is a disjoint union of paths of size three. Next, we consider the CFST
problem when G is a complete graph and $\hat{G}$ is a disjoint union of stars.


Theorem 3. If G is a complete graph and $\hat{G}$ is a disjoint union of stars, then
$I = (G, \hat{G})$ admits a conflict-free spanning tree.


Proof. If $n = |V(G)| \le 2$ then the claim holds.

Now, assume that the claim holds for every possible instance $I = (G, \hat{G})$ of
CFST such that G is a complete graph with $n \le k$ vertices (i.e., $G = K_n$) and
$\hat{G}$ is a disjoint union of stars.

Let $I = (K_{k+1}, \hat{G})$ be an instance where $V(K_{k+1}) = \{v_1, v_2, \dots, v_{k+1}\}$ and
$\hat{G}$ is a disjoint union of stars. By induction, it remains only to prove that
$I = (K_{k+1}, \hat{G})$ also admits a conflict-free spanning tree.

Let $I' = (K_k, \hat{G}')$ be the instance obtained by removing the vertex $v_{k+1}$ and
its edges from $K_{k+1}$ and $\hat{G}$. Let
$E_{v_{k+1}} = \{(v_1, v_{k+1}), (v_2, v_{k+1}), \dots, (v_k, v_{k+1})\}$
be the set of edges incident to $v_{k+1}$ in the graph $K_{k+1}$. Note that
$\hat{G}' = \hat{G} - E_{v_{k+1}}$ remains a disjoint union of stars.
If $E_{v_{k+1}}$ is an independent set in $\hat{G}$ then $T = (\{v_1, v_2, \dots, v_{k+1}\}, E_{v_{k+1}})$ is a
conflict-free spanning tree of I, because $K_{k+1}$ is a complete graph (so, $v_{k+1}$ is a
universal vertex). Otherwise, $E_{v_{k+1}}$ contains a pair $e_a, e_b$ of conflicting edges, and
it holds that either $e_a$ or $e_b$ has degree one in $\hat{G}$, because $\hat{G}$ is a disjoint union of
stars. Let $e_a$ be such an edge. By hypothesis, I' admits a conflict-free spanning
tree T'. Since $e_a$ has degree one in $\hat{G}$ and it conflicts with $e_b$, it holds that $e_a$
has no conflict with any e ∈ E(T'); then, by adding $v_{k+1}$ and $e_a$ to T', we obtain
a conflict-free spanning tree T for $I = (K_{k+1}, \hat{G})$.


The constructive proof of Theorem 3 suggests a polynomial-time algorithm 
to obtain a conflict-free spanning tree when G is a complete graph and G is a 
disjoint union of stars. 


Corollary 2. Let I = (G,G) be an instance of MINIMUM CONFLICT-FREE 
SPANNING TREE where G is an edge-weighted complete graph and G isa disjoint 
union of stars. It holds that a feasible solution for I = (G, G) can be found in 
O(n +m) time, where n = |V(G)| and m = |E(G)|. 


Proof. Let I = (G,G) be an instance of MCFST where G is an edge-weighted 
complete graph and G is a disjoint union of stars. Let V(G) = {v1,v2,...,Un}, 
and let G, be the subgraph of G induced by {v1, v2,...,vx%}. According to the 
proof of Theorem 3, for & from 2 to n we obtain a conflict-free spanning tree for 
G;, by either using the edges incidents to vz; in G, or adding an edge from vz in 
a solution of Gz_1. 

Since $\hat{G}$ is a disjoint union of stars, in $O(n + m)$ time, we can identify and
store in a list $\mathcal{L}$ the information whether the edges incident to $v_k$ in $G_k$ form an
independent set. Note that whenever such a set of edges is not an independent
set, there is an edge $(v_k, v_i)$ that is the center of a star $S$ of $\hat{G}$ and is adjacent to
an edge $(v_k, v_j)$ that is a leaf of $S$, where $i, j < k$. Thus, in $O(n + m)$ time, we can
either store the leaf (edge $(v_k, v_j)$) to be aggregated together with a solution of
$G_{k-1}$, or store the information that $\{(v_1, v_k), (v_2, v_k), \ldots, (v_{k-1}, v_k)\}$ is a solution
for $G_k$. After this preprocessing, we can, in linear time, traverse the stored list
$\mathcal{L}$ from position $n$ to 1, recovering the edges of a conflict-free spanning tree of
$G = G_n$.
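
The construction behind Corollary 2 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: vertices are numbered 1 to n, the conflict graph is passed as a list of conflicting edge pairs and is assumed (not checked) to be a disjoint union of stars, and the names `conflict_free_spanning_tree` and `is_valid` are hypothetical. It processes the vertices from $v_n$ down to $v_2$ directly instead of storing the list first.

```python
def conflict_free_spanning_tree(n, conflicts):
    """Return the edges of a conflict-free spanning tree of K_n.

    Assumption: `conflicts` (pairs of edges) forms a disjoint union of stars.
    """
    def norm(e):
        u, v = e
        return (u, v) if u < v else (v, u)

    adj = {}
    for e, f in conflicts:
        e, f = norm(e), norm(f)
        adj.setdefault(e, set()).add(f)
        adj.setdefault(f, set()).add(e)

    tree = []
    for k in range(n, 1, -1):
        leaf = None
        for i in range(1, k):
            e = (i, k)
            # Conflicts of e that live inside G_k (both endpoints <= k).
            inside = [f for f in adj.get(e, ()) if f[1] <= k]
            # e is a leaf whose only conflict in G_k is another edge at v_k.
            if len(inside) == 1 and inside[0][1] == k:
                leaf = e
                break
        if leaf is None:
            # Edges at v_k are pairwise conflict-free: take the whole star.
            tree.extend((i, k) for i in range(1, k))
            return tree
        tree.append(leaf)  # attach v_k by the leaf edge, continue with G_{k-1}
    return tree


def is_valid(n, tree, conflicts):
    """Check that `tree` is a spanning tree of K_n with no conflicting pair."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in tree:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # cycle
        parent[ru] = rv
    chosen = {tuple(sorted(e)) for e in tree}
    clash = any(tuple(sorted(e)) in chosen and tuple(sorted(f)) in chosen
                for e, f in conflicts)
    return len(tree) == n - 1 and not clash
```

For example, on $K_4$ with a star centered at edge (1, 4) with leaves (2, 4) and (3, 4), plus the conflicting pair (1, 2), (1, 3), the sketch attaches $v_4$ through the leaf (2, 4) and then takes the full star at $v_3$.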


Approximation Results 


By Corollary 1, MINIMUM CONFLICT-FREE SPANNING TREE is strongly NP-hard
on complete graphs $G$ even when the conflict graph is a disjoint union
of paths of size three. However, by Corollary 2, a feasible solution for such
instances of MCFST can be found in $O(n + m)$ time. This motivates the
study of approximability in this particular case of MCFST.


Theorem 4. MINIMUM CONFLICT-FREE SPANNING TREE on instances
$I = (G, \hat{G})$ such that $G$ is a complete graph and $\hat{G}$ is a disjoint union of stars
admits a $c$-approximation algorithm, where $c$ is the maximum weight among the
edges of $G$. In addition, under the same constraints, it does not admit a
$(c - 1)$-approximation algorithm unless P = NP.


Proof. Let $I = (K_n, \hat{G})$ be an instance of MCFST where $K_n$ is a complete graph
with positive weight on the edges and $\hat{G}$ is a disjoint union of stars.

To prove the first statement, we use the algorithm of Theorem 3. Let
$c = \max\{w(e) \mid e \in E(K_n)\}$. By Theorem 3, we can obtain in polynomial time a
feasible solution $T$. In the worst case, all edges in $T$ have weight equal to
$c$, so that $w(T) = c(n - 1)$, but the optimal solution must have cost at least
$n - 1$, because the weights are positive integers. Therefore, $T$ is a $c$-approximate
solution for the problem.
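
The bound in the first statement can be written out as a single chain of inequalities. Here $T^*$ denotes an optimal solution, a symbol introduced only for this illustration, and the last step relies on the weights being positive integers.

```latex
w(T) \;=\; \sum_{e \in E(T)} w(e) \;\le\; c\,(n-1) \;\le\; c \cdot w(T^*),
\qquad \text{since } w(T^*) \ge n - 1 .
```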

Now, to prove the second claim, suppose that there is a $(c - 1)$-approximation
algorithm for the problem. Then we can run this $(c - 1)$-approximation
algorithm on the instances $I = (K_n, \hat{G})$ where $K_n$ is a complete graph with
$w(e) \in \{1, 2\}$ for all $e \in E(K_n)$ and $\hat{G}$ is a disjoint union of $P_3$'s. Note that on
such instances, since $c = 2$, the $(c - 1)$-approximation algorithm must solve
them exactly in polynomial time. However, by Corollary 1, MCFST remains NP-hard
even when restricted to such instances. Therefore, the existence of such a
$(c - 1)$-approximation algorithm would imply that P = NP.


Theorem 5. MINIMUM CONFLICT-FREE SPANNING TREE on complete graphs
$G$ with the conflict graph $\hat{G}$ being a disjoint union of $P_3$'s is an exp-APX-complete
problem.


Proof. To prove the hardness we modify the construction of the proof of
Theorem 2. Let $n$ be the input size of CFST. Instead of adding edges $f$ with weight
$w(f) = 2$, we assign $w(f) = 2^n$. Note that encoding $2^n$ can be done using $O(n)$ bits.
This implies that, if there is an approximation algorithm with a non-exponential
approximation factor for these modified instances, then the resulting solution does
not contain any edge with weight $2^n$, solving the original CFST instance in
polynomial time, which contradicts (unless P = NP) the fact that CFST is
NP-complete when $\hat{G}$ is a disjoint union of $P_3$'s. Thus, this particular case of
MCFST is exp-APX-hard.

Finally, to see that such a particular case is in exp-APX, it is enough to
consider the $c$-approximation algorithm of Theorem 4.


FPT Algorithm 


Throughout this paper, the results show that even if the conflict graph has 
only isolated P3’s, the CFST problem remains challenging. In addition, P3’s 
are necessary structures in the conflict graph for the problem to become hard. 
Recall that a cluster graph is a graph formed from the disjoint union of complete 
graphs. It is well known that a graph is a cluster graph if and only if it has no 
induced path with three vertices, i.e., the class of cluster graphs is exactly the 
class of P3-free graphs. In [16], Zhang, Kabadi, and Punnen showed that if the
conflict graph is a cluster graph (disjoint union of cliques), then MCFST can
be solved in polynomial time. This shows that induced paths with three vertices
are key structures for the intractability of the problem.

Besides the existence of $P_3$'s, another structural property necessary in conflict
graphs $\hat{G}$ on hard instances of MCFST is the existence of $2K_2$'s. It is well known
that $2K_2$-free graphs have a polynomial number of maximal independent sets.
In addition, all distinct maximal independent sets of a $2K_2$-free graph can be
enumerated in polynomial time [4]. Therefore, it is easy to see that MCFST can
be solved in polynomial time on $2K_2$-free conflict graphs $\hat{G}$: it suffices to compute
a minimum spanning tree restricted to each maximal independent set of $\hat{G}$ and keep
the best one. Recall that $2K_2$-free graphs generalize interesting graph classes such
as the class of split graphs.

At this point, we consider the “distance from triviality” of the conflict graph
as a structural parameterization.

The distance to a graph class $\mathcal{F}$ is the minimum number of vertices to be
removed to obtain a graph in $\mathcal{F}$. An $\mathcal{F}$-vertex deletion set $K$ of a graph $H$ is a
set of vertices such that $H[V \setminus K]$ is in $\mathcal{F}$. The $\mathcal{F}$-vertex deletion number, also
known as distance to $\mathcal{F}$, is the minimum cardinality of a set $K$ such that $K$ is
an $\mathcal{F}$-vertex deletion set.

Now, let $\mathcal{F}$ be a hereditary graph class such that MCFST on conflict graphs
$\hat{G} \in \mathcal{F}$ can be solved in polynomial time. Given a minimum $\mathcal{F}$-vertex deletion set
$K$ of a conflict graph $\hat{G}$, we can design an FPT algorithm to solve the MINIMUM
CONFLICT-FREE SPANNING TREE problem parameterized by the distance to $\mathcal{F}$
of the conflict graph $\hat{G}$.

Note that cluster graphs and $2K_2$-free graphs are particular cases of $\mathcal{F}$. Also,
notice that CLUSTER VERTEX DELETION and $2K_2$-FREE VERTEX DELETION
can be solved in $2^{O(k)} \cdot n^{O(1)}$ time when parameterized by the solution size ($k$)
through the bounded search tree technique. In addition, a $1.811^k \cdot n^{O(1)}$-time
algorithm for CLUSTER VERTEX DELETION can be found in [15]. Therefore,
we may assume that a minimum $\mathcal{F}$-vertex deletion set of $\hat{G}$ is given whenever
it can be computed in FPT time (concerning its size), and the MINIMUM
CONFLICT-FREE SPANNING TREE problem is parameterized by the distance to
$\mathcal{F}$ of the conflict graph. Zhang, Kabadi, and Punnen [16] showed that MCFST
is polynomial-time solvable if the distance to cluster of the conflict graph is
bounded by a constant. Next, we generalize such a result.


Theorem 6. Let $\mathcal{F}$ be a hereditary graph class such that MCFST on conflict
graphs $\hat{G} \in \mathcal{F}$ can be solved in polynomial time, and let $(G, \hat{G})$ be an instance
of MINIMUM CONFLICT-FREE SPANNING TREE. Given a minimum $\mathcal{F}$-vertex
deletion set $K$ of $\hat{G}$, a minimum conflict-free spanning tree of $(G, \hat{G})$ can be
found (if any) in $2^k \cdot n^{O(1)}$ time, where $k = |K|$.


Proof. We assume that a minimum $\mathcal{F}$-vertex deletion set $K$ of $\hat{G}$ is given.
Moreover, we denote by $\mathcal{F}$-Algorithm a polynomial-time algorithm to solve
MCFST on conflict graphs $\hat{G} \in \mathcal{F}$. Next, we show the pseudocode of our
FPT-MCFST algorithm. We use $N_{\hat{G}}(v)$ to indicate the (open) neighborhood of $v$ in
the graph $\hat{G}$, and use $N_{\hat{G}}[v]$ to denote its closed neighborhood.


FPT-MCFST($G$, $\hat{G}$, $K$):
  if $|K| = 0$ then
    return $\mathcal{F}$-Algorithm($G$, $\hat{G}$)
  else
    Take an element $e = (u, v)$ of $K$
    $K \leftarrow K \setminus \{e\}$
    $(V(G'), E(G')) \leftarrow (V(G), E(G) \setminus \{e\})$        # edge e is not in the solution
    $\hat{G}' \leftarrow \hat{G} - \{e\}$
    $(V(G''), E(G'')) \leftarrow (V(G), E(G) \setminus N_{\hat{G}}(e))$   # edge e is in the solution
    $K'' \leftarrow K \setminus N_{\hat{G}}(e)$
    $\hat{G}'' \leftarrow \hat{G} - N_{\hat{G}}[e]$
    return min{FPT-MCFST($G'$, $\hat{G}'$, $K$), FPT-MCFST($G''$, $\hat{G}''$, $K''$)}
  end

Algorithm 1: Parameterized algorithm for MCFST.
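
To make the branching concrete, the following Python sketch instantiates Algorithm 1 for one particular choice of class, which is our assumption and not part of the theorem: $\mathcal{F}$ is the class of edgeless graphs, so that $K$ is a vertex cover of the conflict graph and the $\mathcal{F}$-Algorithm is simply Kruskal's minimum spanning tree algorithm. The names `mst_weight` and `fpt_mcfst` and the input encoding (edges as tuples, conflicts as a dictionary of sets) are hypothetical. The sketch returns only the weight of an optimal solution, or infinity if none exists.

```python
import math


def mst_weight(n, edges, weight):
    """Kruskal: weight of a minimum spanning tree on vertices 0..n-1, or inf."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, used = 0, 0
    for e in sorted(edges, key=lambda e: weight[e]):
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:
            parent[ru] = rv
            total += weight[e]
            used += 1
    return total if used == n - 1 else math.inf


def fpt_mcfst(n, edges, weight, conflicts, K):
    """Bounded search tree over the deletion set K (here: a vertex cover
    of the conflict graph, an assumption of this sketch)."""
    K = [e for e in K if e in edges]
    if not K:
        # No conflicts remain among the surviving edges: plain MST.
        return mst_weight(n, edges, weight)
    e, rest = K[0], K[1:]
    # Branch 1: e is not in the solution, so drop it from G.
    excluded = fpt_mcfst(n, edges - {e}, weight, conflicts, rest)
    # Branch 2: e may be used, so drop everything that conflicts with e.
    nbrs = conflicts.get(e, set()) & edges
    allowed = fpt_mcfst(n, edges - nbrs, weight, conflicts,
                        [f for f in rest if f not in nbrs])
    return min(excluded, allowed)
```

As in the pseudocode, the second branch does not force $e$ into the tree; it only removes the edges conflicting with $e$ and leaves $e$ available.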


When $|K| = 0$, the resulting conflict graph is in $\mathcal{F}$, and the problem can be
solved in polynomial time by the $\mathcal{F}$-Algorithm that we assume exists. Otherwise,
Algorithm 1 works as a bounded search tree using the $\mathcal{F}$-vertex deletion set $K$
to branch into two branches that distinguish the potential use or not
of an element $e \in K$ in the solution. If we use $e$ in the solution, we should delete
the elements in $N_{\hat{G}}(e)$ (the neighborhood of $e$ in $\hat{G}$) from both $G$ and $\hat{G}$. In
addition, we also remove $e$ from $\hat{G}$ because isolated vertices are irrelevant in
conflict graphs. On the other hand, if $e$ is not in the solution, we should remove
$e$ from $G$ and $\hat{G}$. Note that $e$ is removed from $\hat{G}$ in both cases. This procedure
is applied recursively until $|K| = 0$, resulting in a conflict graph in $\mathcal{F}$ because
$\mathcal{F}$ is a hereditary graph class. Note that we do not need to fix a particular edge
in a solution; we just need to leave it free to be used if necessary. Finally, when
the conflict graph becomes a graph in the class $\mathcal{F}$, we can run the $\mathcal{F}$-Algorithm
to solve the resulting instance. Since the FPT-MCFST algorithm constructs a
bounded search tree with $k$ levels, we have a $2^k \cdot n^{O(1)}$-time algorithm.

As a corollary, it follows that MCFST can be solved in FPT time when
parameterized by the distance to $2K_2$-free graphs or the distance to $P_3$-free
graphs. We leave as an open problem the complexity of the problem when the
conflict graph is $(K_2 \cup P_3)$-free or $2P_3$-free.

Abstract. $B_0$-VPG graphs are intersection graphs of axis-parallel line
segments in the plane. We show that all AT-free outerplanar graphs are
$B_0$-VPG. In the course of the argument, we show that any AT-free
outerplanar graph can be identified as an induced subgraph of a 2-connected
outerplanar graph whose weak dual is a path. Our $B_0$-VPG drawing
procedure works for such graphs and has the potential to be extended to
larger classes of outerplanar graphs.


Keywords: Outerplanar · AT-free · $B_0$-VPG · Grid intersection
graph · Graph drawing


1 Introduction 


A $k$-bend path is a simple path in a two-dimensional grid with at most $k$ bends.
Geometrically, such paths are polylines in the plane made of at most $k+1$ axis-parallel
(horizontal or vertical) line segments. Vertex intersection graphs of Paths on a
Grid (VPG) (resp., $B_k$-VPG) is the class of graphs which can be represented
as intersection graphs of simple (resp., $k$-bend) paths in a two-dimensional grid.
The bend number of a graph $G$ in VPG is the minimum $k$ for which $G$ is in
$B_k$-VPG. VPG graphs are equivalent to string graphs, which are intersection
graphs of curves in the plane. One motivation to study $B_k$-VPG graphs comes
from VLSI circuit design, where the paths correspond to wires in the circuit. A
natural concern in VLSI design is to reduce the number of bends in each path
(wire) in the representation. A second motivation is that certain algorithmic
tasks become easier when restricted to $B_1$-VPG or $B_0$-VPG graphs (cf. [21]).
Some of the standard graph theoretic terminology used in the rest of the
introduction is defined in Sect. 1.2. Planar graphs have received the most
attention from the perspective of $B_k$-VPG representations, some of which we
describe in Sect. 1.1. Following up on a series of improvements and conjectures
by various authors, Gonçalves, Isenmann, and Pennarun in 2018 showed that all
planar graphs are $B_1$-VPG [17]. This is tight since many simple planar graphs
like the 4-wheel, the 3-sun, and the triangular prism, to name a few, are not $B_0$-VPG. This
makes the question of characterizing $B_0$-VPG planar graphs very appealing.
Even though recognizing $B_0$-VPG graphs is NP-complete in general, we do not
know the recognition complexity of the same question restricted to planar graphs.
Segment intersection graphs are intersection graphs of line segments in the plane.
Chalopin and Gonçalves in 2009 showed that every planar graph is a segment
intersection graph [7], confirming a conjecture of Scheinerman from 1984 [24].
One way to refine the class of segment intersection graphs is to restrict the
number of directions permitted for the segments. If the number of directions is
limited to two, we rediscover $B_0$-VPG. $k$-DIR graphs are intersection graphs of
line segments that can lie in at most $k$ directions in the plane. It is known that
bipartite planar graphs are 2-DIR [12,13,18] and triangle-free planar graphs are
3-DIR [4]. West conjectures that any planar graph is 4-DIR [25]. This adds to
the appeal for characterizing $B_0$-VPG planar graphs.

Characterizing outerplanar $B_0$-VPG graphs will be a good step towards the
above since some of the structures that forbid a planar graph from having a
$B_0$-VPG representation are also present among outerplanar graphs. Outerplanar
graphs were known to be $B_1$-VPG [5] before the same was shown for planar
graphs. In this article we take a small step towards this goal by showing that
AT-free outerplanar graphs are $B_0$-VPG. To do so, we first show that any AT-free
outerplanar graph can be identified as an induced subgraph of a 2-connected
linear outerplanar graph. We call a 2-connected outerplanar graph linear if its
weak dual is a path. This result may be of independent interest. Our $B_0$-VPG
drawing is essentially for 2-connected linear outerplanar graphs and has the
potential to be extended to larger classes of outerplanar graphs. However, we cannot
extend this result to AT-free planar graphs since we have examples of AT-free
planar graphs, like the 4-wheel and the triangular prism, which are not $B_0$-VPG.

After a brief literature review in Sect. 1.1 and a recall of some standard graph
theoretic terminology in Sect. 1.2, we spread the proof of our main result over two
sections. In Sect. 2 we introduce a subclass of outerplanar graphs called linear
outerplanar graphs and show that every AT-free outerplanar graph is linear.
Further, we show that every linear outerplanar graph is an induced subgraph
of a 2-connected linear outerplanar graph. In Sect. 3, we show that every
2-connected linear outerplanar graph is $B_0$-VPG. Since $B_0$-VPG is easily seen to
be a hereditary graph class (closed under induced subgraphs), it follows that all
linear outerplanar graphs are $B_0$-VPG. Furthermore, since we have shown that
all AT-free outerplanar graphs are linear, the main result of this article follows.


1.1 Literature 


The class $B_k$-VPG was introduced by Asinowski et al. in 2012 [2]. Nevertheless,
these graphs were previously studied in various forms. One of them is grid
intersection graphs (GIG), which are bipartite graphs that can be represented as
intersection graphs of horizontal and vertical line segments in the plane where
no two parallel line segments intersect [18]. It is easy to see that the classes of
bipartite $B_0$-VPG graphs and GIGs are equivalent.

Apart from the celebrated result by Gonçalves et al. that planar graphs are
$B_1$-VPG [17], we can see a chronology of results on $B_k$-VPG in planar graphs.
Hartman et al. proved that bipartite planar graphs are GIG [18] and hence
$B_0$-VPG. In [2], Asinowski et al. showed that planar graphs are $B_3$-VPG and
conjectured that it is tight. Disproving this conjecture, Chaplick and Ueckerdt
proved that planar graphs have a $B_2$-VPG representation [10]. Biedl and Derka
further improved the result by proving that planar graphs have a 1-string
$B_2$-VPG representation, which means that any two paths can intersect at most once
[3]. Francis and Lahiri proved that any plane graph formed by connecting the
leaves of a tree, making only simple cycles, has a $B_1$-VPG representation [15].

The recognition problem for string graphs and hence VPG graphs is
NP-complete [19,23]. The recognition problem for 2-DIR graphs and hence $B_0$-VPG
graphs is NP-complete [20]. The recognition of whether a given graph is in
$B_k$-VPG, $k > 0$, is NP-complete even if it is guaranteed that the given graph is in
$B_{k+1}$-VPG [9]. The same article also shows that the classes $B_k$-VPG and
$B_{k+1}$-VPG, $k > 0$, are separated. Chaplick et al. left the possibility of such a
separation in chordal graphs open [9], which was answered partially by Chakraborty
et al. [6]. Cohen et al. showed the existence of a cocomparability graph with
bend number $k$ for all $k > 0$ (Theorem 3.1 in [11]). Chaplick et al. provided
a polynomial time decision algorithm for $B_0$-VPG chordal graphs in 2011 [8].
$B_0$-VPG characterizations are known for block graphs [1], split graphs, chordal
bull-free graphs, chordal claw-free graphs [16] and cocomparability graphs [22].


1.2 Terminology and Notation 


The closed neighborhood $N[v]$ of a vertex $v$ in a graph $G$ is the set containing $v$
and its neighbors in $G$. We will refer to vertices of a graph with $k$ neighbors as
$k$-degree vertices. A graph $G$ is $H$-free if $G$ does not contain an induced subgraph
isomorphic to $H$. We use $C_k$ to denote the simple cycle on $k$ vertices. A cycle on
$k$ vertices $x_0, \ldots, x_{k-1}$ where each $x_i$ is adjacent to $x_{i+1}$ (addition is modulo $k$)
can also be denoted as $x_0, \ldots, x_{k-1}, x_0$. A $C_k$ together with an additional vertex
$v$ adjacent to all the vertices of the cycle is called a wheel graph, denoted as $W_k$. A
triangular prism is the complement of $C_6$. A triangle $v_0, v_1, v_2, v_0$ together with
an independent set $S = \{u_0, u_1, u_2\}$ is called a 3-sun if $\forall i \in \{0, 1, 2\}$,
$N[u_i] = \{u_i, v_i, v_{i+1}\}$ where addition is modulo 3.

A set of three independent vertices is called an asteroidal triple or AT when 
there exists a path among each pair of them containing no vertex from the closed 
neighborhood of the third vertex. An AT-free graph is a graph which does not 
have an AT. A subset of vertices in a graph is called a separator if its removal 
increases the number of components of the graph. A vertex x is a cutvertex if 
{x} is a separator. A graph is k-connected if it does not have a separator of size 
smaller than k. A graph is connected if it is 1-connected. A block of a graph 
is any maximal 2-connected subgraph of the graph. A trivial block is a block 
containing at most two vertices.
A plane graph is an embedding of a planar graph in the plane with no crossing
edges. A face in a plane graph is any region in the plane bounded by edges in the 
graph and not containing any other vertex or edge. Among the faces contributed 
by a plane graph in the plane, exactly one is an unbounded (outer) face and the 
remaining are bounded (inner) faces. The dual graph H of a plane graph G is a 
graph that has a vertex for each face of G and an edge between two vertices in 
H if the corresponding faces of G share an edge. The weak dual of a plane graph 
G is an induced subgraph of its dual whose vertices correspond to the bounded 
faces of G. A boundary edge in a plane graph is an edge that is shared by the 
unbounded face of the graph. Edges shared only by bounded faces are called 
internal edges. A planar graph is called outerplanar if it has a plane embedding 
in which all the vertices are incident on the outer face. Outerplanar graphs will 
always be drawn in such a way that the outer (unbounded) face contains all the 
vertices and hence the above terminology of faces, duals, weak duals, boundary 
edges and internal edges will be used assuming such a plane drawing. 
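
The weak dual is easy to compute once the bounded faces of an outerplanar drawing are known. The following Python sketch assumes, purely for illustration, that each bounded face is given as a list of its vertices in cyclic order; the function names `weak_dual` and `is_path` are hypothetical. Two bounded faces are adjacent in the weak dual exactly when they share an edge.

```python
from itertools import combinations


def face_edges(face):
    """Edges on the boundary of a face given as a cyclic vertex list."""
    k = len(face)
    return {frozenset((face[i], face[(i + 1) % k])) for i in range(k)}


def weak_dual(faces):
    """Adjacency sets of the weak dual; vertex i stands for faces[i]."""
    boundaries = [face_edges(f) for f in faces]
    dual = {i: set() for i in range(len(faces))}
    for i, j in combinations(range(len(faces)), 2):
        if boundaries[i] & boundaries[j]:  # the two faces share an edge
            dual[i].add(j)
            dual[j].add(i)
    return dual


def is_path(dual):
    """True if the graph is a single path (a lone vertex counts as a path)."""
    n = len(dual)
    if n == 0:
        return False
    m = sum(len(nb) for nb in dual.values()) // 2
    if m != n - 1 or any(len(nb) > 2 for nb in dual.values()):
        return False
    seen, stack = {0}, [0]
    while stack:  # connectivity check
        u = stack.pop()
        for w in dual[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n
```

For a fan of three triangles the weak dual is a path, whereas a triangle with a triangle glued to each of its three sides has a weak dual with a vertex of degree 3.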


2 Linear Outerplanar Graphs 


In this section, we introduce a subclass of outerplanar graphs called linear out- 
erplanar graphs (Definition 2) and show that every AT-free outerplanar graph 
is linear. Further, we show that every linear outerplanar graph is an induced 
subgraph of a 2-connected linear outerplanar graph. Though the definition of 
general linear outerplanar graphs is technical, 2-connected linear outerplanar 
graphs turn out to be exactly those 2-connected outerplanar graphs whose weak 
dual is a path. This simplicity is exploited in the next section to find a $B_0$-VPG
representation for them.


Definition 1. Let G be an outerplanar graph. A face of G which corresponds 
to a leaf of the weak dual is called a leaf face. A block B of G is called safe if B 
has at most two cutvertices and if B contains more than one face, then to each 
cutvertex x in B we can associate a different leaf face of B containing x. A block 
B of G incident to a cutvertex x is called big for x if either B has a cutvertex 
other than x or a face not containing x. A cutvertex x of G is called safe if at 
most two blocks of G incident to x are big. 


Definition 2 (Linear Outerplanar Graph). An outerplanar graph G is 
called linear if the weak dual of G is a linear forest, and every block and cutvertex 
of G are safe. 


We start with some observations that will help us prove that AT-free outer- 
planar graphs are linear. 


Observation 1. If $v$ is a 2-degree vertex in a 2-connected outerplanar graph $G$,
then $N[v]$ is not a separator of $G$.


Proof. Since the degree of $v$ is 2, $v$ and its two neighbors are three consecutive
vertices in the outer face of $G$. Removing three consecutive vertices from the
outer cycle of $G$ does not disconnect $G$.


Fig. 1. A linear outerplanar graph with three cutvertices $x$, $y$, $z$ and five blocks. Notice
that all blocks and cutvertices are safe.


The following list of observations is an immediate consequence of applying
the above observation to a block of an outerplanar graph and the trivial fact
that a single vertex does not separate a 2-connected graph.


Observation 2. Let $B$ be a block of an outerplanar graph $G$. A 2-degree vertex
in $B$ is a vertex in $B$ whose degree inside $B$ is 2. For every cutvertex $x$ in $B$,
we will denote a neighbor of $x$ outside $B$ by $x'$.


(i) If $a, b, c$ are three pairwise non-adjacent 2-degree vertices in $B$, then $\{a, b, c\}$
forms an AT in $G$.
(ii) If $a$ and $b$ are two non-adjacent 2-degree vertices in $B$ and $c$ is a cutvertex
in $B$ non-adjacent to $a$ and $b$, then $\{a, b, c'\}$ forms an AT in $G$.
(iii) If $a$ and $b$ are two cutvertices in $B$ and $c$ is a 2-degree vertex in $B$
non-adjacent to $a$ and $b$, then $\{a', b', c\}$ forms an AT in $G$.
(iv) If $a, b, c$ are three cutvertices in $B$, then $\{a', b', c'\}$ forms an AT in $G$.
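
Asteroidal triples such as those in Observation 2 can be checked mechanically from the definition. The following Python sketch is a brute-force test over all triples, intended only for small examples; the graph encoding (a dictionary mapping each vertex to its set of neighbors) and the names `find_asteroidal_triple` and `cycle` are our own assumptions.

```python
from itertools import combinations


def connected_avoiding(adj, s, t, forbidden):
    """Is there an s-t path that uses no vertex of `forbidden`?"""
    if s in forbidden or t in forbidden:
        return False
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen and w not in forbidden:
                seen.add(w)
                stack.append(w)
    return False


def find_asteroidal_triple(adj):
    """Return some asteroidal triple of the graph, or None if it is AT-free."""
    for a, b, c in combinations(sorted(adj), 3):
        if b in adj[a] or c in adj[a] or c in adj[b]:
            continue  # the three vertices must be pairwise non-adjacent
        # Each pair must be joined by a path avoiding N[third vertex].
        if all(connected_avoiding(adj, u, v, adj[w] | {w})
               for u, v, w in ((a, b, c), (a, c, b), (b, c, a))):
            return (a, b, c)
    return None


def cycle(k):
    """The cycle C_k on vertices 0..k-1."""
    return {i: {(i - 1) % k, (i + 1) % k} for i in range(k)}
```

For instance, $C_6$ is a single block with three pairwise non-adjacent 2-degree vertices, as in item (i), and the sketch reports an AT in it, while $C_5$ is AT-free.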


Observation 3. Let $B$ be a block of an outerplanar graph. Every leaf face of $B$
contains a 2-degree vertex in $B$ which is not incident to any other bounded face
of $B$.


Proof. A face has at least three vertices and at most two vertices of a leaf face 
can be shared by another face. Hence all the remaining vertices are 2-degree 
vertices in B not incident to any other bounded face of B. 


Theorem 1. AT-free outerplanar graphs are linear. 


Proof. Let G be an AT-free outerplanar graph. To prove that G is linear, we 
need to show that G satisfies three conditions. 


1. Weak dual of $G$ is a linear forest. Let $W$ be the weak dual of $G$. Since $G$ is
outerplanar, $W$ is a forest [14]. If $W$ is not a linear forest, then it has a vertex
of degree 3 (or more). Let that vertex be $f$ and three of its neighbors be $n_1$,
$n_2$, and $n_3$. Let the corresponding faces in $G$ be $F$ for $f$, and $N_i$ for $n_i$, $i \in [3]$.
Consider the induced subgraph $H$ formed by the vertices of $F$, $N_1$, $N_2$, and $N_3$. It
is easy to see that $H$ is a 2-connected outerplanar graph where $N_1$, $N_2$, and $N_3$


108 S. Jain et al. 


are leaf faces in it. For each $i \in [3]$, let $a_i$ be a 2-degree vertex which belongs
exclusively to $N_i$ (Observation 3). By Observation 2(i), we can conclude that
$\{a_1, a_2, a_3\}$ forms an AT in $H$ and hence in $G$. Therefore $W$ is a linear forest.


2. Every block of $G$ is safe. Let $B$ be a block of $G$. Suppose $B$ has at least 3
cutvertices $a$, $b$, and $c$. Then by Observation 2(iv), $G$ has an AT. Let $B$ be a
block of $G$ with at most two cutvertices but more than one face. This implies
that $B$ has at least two leaf faces, $F_1$ and $F_2$. If $B$ has a cutvertex which is
neither incident to $F_1$ nor $F_2$, then by Observations 3 and 2(ii), we have an
AT in $G$. Hence every cutvertex of $B$ is incident to a leaf face of $B$. If $B$ has
two cutvertices $x$ and $y$ and neither of them is incident to one of the leaf faces,
then by Observations 3 and 2(iii), we have an AT in $G$. Hence $x$ and $y$ can be
associated with two leaf faces containing them. Hence $B$ is safe.


3. Every cutvertex of $G$ is safe. Let $x$ be a cutvertex in $G$. Suppose $x$ has at
least 3 big blocks $B_1$, $B_2$, and $B_3$ incident to it. For each $i \in [3]$, since $B_i$ is
big, it either contains a face $F_i$ not containing $x$ or another cutvertex $x_i$ and
another block $B_i'$ incident to $x_i$. In both cases we can find a vertex $y_i$ (in $F_i$ or
$B_i'$ respectively) which is at a distance 2 or more from $x$. One can easily see that
$\{y_1, y_2, y_3\}$ forms an AT in $G$. Hence $x$ is safe.


We end this section by showing that every linear outerplanar graph is an
induced subgraph of a 2-connected linear outerplanar graph. Figure 2 shows a
2-connected linear outerplanar graph which contains the graph in Fig. 1 as an
induced subgraph.


Lemma 1. Every connected linear outerplanar graph is an induced subgraph of 
a 2-connected linear outerplanar graph. 


Proof. Let $G$ be a connected linear outerplanar graph. Let $B_1$ and $B_2$ be any two
blocks of $G$ joined by a cutvertex $x$. For each $i \in \{1, 2\}$, we choose a neighbor
$y_i$ of $x$ from $B_i$ as follows. If $B_i$ is a single edge or a single face $L_i$, then any
neighbor of $x$ in $B_i$ can be chosen as $y_i$. If $B_i$ has more than one face, then we
have a leaf face $L_i$ of $B_i$ associated to $x$. Pick $y_i$ to be a neighbor of $x$ from $L_i$
such that $xy_i$ is a boundary edge of $L_i$. Let the supergraph of $G$ obtained by
adding a new vertex $x'$ adjacent to $y_1$ and $y_2$ be called $G'$ and the new block
containing $B_1$, $B_2$ and $x'$ be called $B'$.

Since $x$ is a cutvertex and $xy_1, xy_2$ are boundary edges of $G$, the block $B'$
remains outerplanar. Let the weak duals of $B_1$ and $B_2$ respectively be the (possibly
empty) paths $F_1, \ldots, F_{k-1}$ and $F_{k+1}, \ldots, F_r$ with $F_{k-1} = L_1$ and $F_{k+1} = L_2$
when they are non-empty paths. Then the path $F_1, \ldots, F_{k-1}, F_k, F_{k+1}, \ldots, F_r$ is
the weak dual of $B'$ where $F_k$ corresponds to the new face $x, y_1, x', y_2, x$. This
holds true even if $B_1$ is trivial ($k = 1$) or $B_2$ is trivial ($k = r$), or both. Hence $G'$
is outerplanar and its weak dual is a linear forest. We argue next that the graph
$G'$ is linear outerplanar in two special cases. Since the blocks of $G'$ other than
$B'$ and cutvertices of $G'$ not in $B'$ remain safe in $G'$, it suffices to argue that
the block $B'$ and the cutvertices in it are safe in $G'$. Moreover, any cutvertex
of $B'$ other than $x$ is safe since the number of big blocks incident to it does not
increase due to the merging of $B_1$ and $B_2$. This is because if $B_i$, $i \in \{1, 2\}$, has
two cutvertices, then $B_i$ is already big for both. Hence it suffices to show that
the block $B'$ and the vertex $x$ (if it remains a cutvertex in $G'$) are safe.

Since $x$ is a safe cutvertex in $G$, at most two blocks of $G$ incident to $x$ are big.
Thus we can always find two blocks $B_1$ and $B_2$ incident to $x$ such that either
$B_2$ is not big for $x$ or $B_1$, $B_2$ are the only two blocks incident to $x$. Hence the
following two cases are exhaustive.


Case 1 ($B_2$ is not big for $x$). Since $B_2$ does not have any cutvertex other than $x$,
$B'$ has at most two cutvertices in $G'$, possibly $x$ and another vertex $x_1$ from $B_1$.
If $B_1$ has more than one face, then since $x$ is associated with $F_{k-1}$, $x_1$ belongs
to $F_1$. If $B_1$ is a single edge or a single face, then $x_1$ still belongs to $F_1$ in $B'$.
Since $B_2$ is not big for $x$, $F_r$ contains $x$. Hence we can associate $x_1$ to $F_1$ and $x$
to $F_r$ in $B'$ and conclude that $B'$ is safe. One can verify that $B'$ is big for $x$ in
$G'$ only if $B_1$ is big for $x$ in $G$. Hence $x$ is safe in $G'$.


Case 2 ($B_1$ and $B_2$ are the only two blocks incident to $x$). In this case, $x$ is no
longer a cutvertex in $G'$. Hence $B'$ has at most two cutvertices, possibly $x_1$ from
$B_1$ and $x_2$ from $B_2$. One can verify, as in the first case, that $x_1$ belongs to $F_1$
whether $B_1$ is trivial or not and $x_2$ belongs to $F_r$ whether $B_2$ is trivial or not.
Hence $B'$ is safe in $G'$.

Repeating the above merging till no cutvertices are left results in 
a 2-connected linear outerplanar graph which contains G as an induced 
subgraph. 


Fig. 2. A 2-connected linear outerplanar graph constructed from the graph in Fig. 1
using the construction procedure employed in the proof of Lemma 1. It can be verified
that the additions of the new vertices $x'$, $y'$, $z_1'$ and $z_2'$ fall under Cases 1, 2, 1 and 2
of the proof, respectively.


Remark 1. We would also like to point out that not all connected AT-free
outerplanar graphs are induced subgraphs of 2-connected AT-free outerplanar graphs.
Let $G$ be a $C_5$ together with a pendant vertex. While $G$ is AT-free outerplanar,
any 2-connected outerplanar graph $G'$ containing $G$ as an induced subgraph is
not AT-free. To see this, consider the induced subgraph $H$ of $G'$ formed by the
$C_5$ in $G$ and another face $F'$ sharing an edge with this $C_5$. We can pick one
2-degree vertex $a$ from $F'$ and two non-adjacent 2-degree vertices $b$ and $c$ from
the $C_5$ which are both non-adjacent to $a$. From Observation 2(i), it follows that
$\{a, b, c\}$ is an AT. This is the obstacle which nudged us to study the larger class
of linear outerplanar graphs, which admittedly is rather technical.


3 $B_0$-VPG Representation of 2-connected Linear
Outerplanar Graphs


It’s drawing time. In this section, we show that every 2-connected linear
outerplanar graph is $B_0$-VPG (Lemma 2). The proof of Lemma 2 is algorithmic:
it draws a $B_0$-VPG representation for any 2-connected linear outerplanar
graph (cf. Fig. 3 for an example). Since $B_0$-VPG is easily seen to be a hereditary
graph class (closed under induced subgraphs), it follows from Lemma 1 that all
linear outerplanar graphs are $B_0$-VPG. Furthermore, since we have shown that
all AT-free outerplanar graphs are linear (Theorem 1), the main result of this
article follows.


Lemma 2. Every 2-connected linear outerplanar graph is $B_0$-VPG.


Proof. Let $G$ be a 2-connected linear outerplanar graph with $n$ faces labeled
$F_1, \ldots, F_n$ such that the weak dual of $G$ is the path $F_1, \ldots, F_n$. For each
$i \in [n - 1]$, the edge shared by $F_i$ and $F_{i+1}$ is denoted by $e_i$. For notational
convenience, we set $e_n$ to be any boundary edge of $F_n$. For each $i \in [n]$, let $G_i$
denote the induced subgraph of $G$ restricted to the faces $F_1, \ldots, F_i$.

In a $B_0$-VPG drawing $D_i$ of $G_i$, we call a non-point horizontal (resp., vertical)
line segment $l$ in $D_i$ extendable from a point $p \in l$ if at least one of the two infinite
horizontal (resp., vertical) open rays starting at $p$ (but not containing $p$) does not
intersect any other line segment of $D_i$. A point segment $l$ is said to be extendable
from its location $p$ if it is extendable from $p$ both as a horizontal and a vertical
line segment. An edge $xy$ in $G_i$ is said to be extendable in $D_i$ if the line segments
$l_x$ and $l_y$ representing the vertices $x$ and $y$ are extendable from a common point
$p \in l_x \cap l_y$ either in the same direction or in orthogonal directions. Finally, a
$B_0$-VPG drawing $D_i$ is said to be extendable if $e_i$ is extendable and whenever
$F_i$ is a triangle, the vertex of $F_i$ not incident to $e_{i-1}$ is represented by a point
segment. Note that if $F_1$ is a triangle, all the vertices of $F_1$ are represented by
point segments in $D_1$.
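
A $B_0$-VPG drawing is just a family of axis-parallel segments, possibly degenerate to points, whose intersection graph is the given graph. The following Python sketch computes that intersection graph so that small drawings can be checked by hand. The encoding of a segment as a tuple (x1, y1, x2, y2) and the names `intersection_graph`, `rectangle` and `triangle` are assumptions made for this illustration; extendability is not checked. For instance, a cycle of length 4 can be laid out on the sides of a rectangle, and a triangle can be drawn as three point segments at a common point.

```python
from itertools import combinations


def intersects(s, t):
    """Two axis-parallel segments (or points) share a point exactly when
    their x-ranges overlap and their y-ranges overlap."""
    sx, sy = sorted((s[0], s[2])), sorted((s[1], s[3]))
    tx, ty = sorted((t[0], t[2])), sorted((t[1], t[3]))
    return (sx[0] <= tx[1] and tx[0] <= sx[1]
            and sy[0] <= ty[1] and ty[0] <= sy[1])


def intersection_graph(segments):
    """Edge set of the intersection graph of a dict vertex -> segment."""
    for x1, y1, x2, y2 in segments.values():
        if x1 != x2 and y1 != y2:
            raise ValueError("segment is not axis-parallel")
    return {frozenset((u, v)) for u, v in combinations(segments, 2)
            if intersects(segments[u], segments[v])}


def cycle_edges(vertices):
    """Edge set of the cycle visiting `vertices` in order."""
    k = len(vertices)
    return {frozenset((vertices[i], vertices[(i + 1) % k])) for i in range(k)}


# C_4 on the four sides of a rectangle; consecutive sides share a corner.
rectangle = {"a": (0, 0, 2, 0), "b": (2, 0, 2, 2),
             "c": (0, 2, 2, 2), "d": (0, 0, 0, 2)}
# A triangle drawn as three point segments at the same location.
triangle = {"x": (1, 1, 1, 1), "y": (1, 1, 1, 1), "z": (1, 1, 1, 1)}
```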

If the length of $F_1$ is 4 or more, then we can represent $F_1$ as the intersection
graph of line segments laid out on the boundary of an axis-parallel rectangle
with the segments representing the endpoints of $e_1$ being orthogonal (and hence
sharing only a corner of the rectangle). This is an extendable $B_0$-VPG drawing $D_1$
of $G_1$. If $F_1 = C_3$, then representing all the three vertices as point segments at
the same point gives an extendable $B_0$-VPG drawing $D_1$ of $G_1$. Let $D_i$, $i < n$, be
an extendable $B_0$-VPG drawing of $G_i$. From $D_i$, we construct an extendable
$B_0$-VPG drawing $D_{i+1}$ of $G_{i+1}$ as follows.


(b) The $B_0$-VPG drawing of the graph depicted in Fig. 2.


Fig. 3. A 2-connected linear outerplanar graph $G$ and a $B_0$-VPG representation of it.
The collinear overlapping line segments are drawn a little apart for clarity. The point
segments (e.g., vertices 1 and 25) are drawn as black squares.


Case 1 (length of $F_{i+1}$ is 4 or more). Let $F_{i+1} = v_0, \ldots, v_k, v_0$, with $e_i = v_k v_0$
and $e_{i+1} = v_j v_{j+1}$, $j < k$. Since $D_i$ is extendable, the edge $v_k v_0$ is extendable in
$D_i$. Extend the line segments $l_k$ and $l_0$ (representing $v_k$ and $v_0$ respectively) in
orthogonal directions to two points $q_k$ and $q_0$ outside of the bounding box of $D_i$.
Let $q$ be the intersection point of the perpendiculars to $l_k$ and $l_0$ at $q_k$ and $q_0$
respectively. Represent the path $v_1, \ldots, v_{k-1}$ on the two line segments from $q_0$ to
$q$ and $q$ to $q_k$ such that $v_1$ is represented by a segment containing $q_0$, $v_{k-1}$ by a
segment containing $q_k$, and $v_j$, $v_{j+1}$ by orthogonal line segments sharing a point.
The point shared by these two segments will be $q_0$ when $j = 0$, $q_k$ when $j = k-1$,
and $q$ in all other cases. This gives the drawing $D_{i+1}$. It is clear that the new
line segments added in this stage do not intersect any other line segments in $D_i$
except $l_0$ and $l_k$. It is easy to verify that the edge $e_{i+1} = v_j v_{j+1}$ is extendable.
Hence $D_{i+1}$ is extendable.


Case 2 ($F_{i+1} = C_3$). Let $F_{i+1} = a, b, c, a$, with $e_i = ca$ and $e_{i+1} = ab$. Since $D_i$
is extendable, the edge ca is extendable in $D_i$ from a point p. If the line segments
$l_c$ and $l_a$ are extendable in the same direction, then extend them to a point q
outside the bounding box of $D_i$ and represent b by a point segment $l_b$ at q to
obtain $D_{i+1}$. It is easy to check that the line segment $l_a$, the point segment $l_b$,
and also the edge ab are extendable from q in $D_{i+1}$. Since ab is extendable and b
is represented by a point segment, $D_{i+1}$ is extendable. If $l_c$ and $l_a$ are extendable
only in orthogonal directions, then neither of them is a point segment. Hence
$F_i \neq C_3$ and hence the vertices c and a have no common neighbor in $G_i$. So
the point p is not contained in any line segment of $D_i$ other than $l_c$ and $l_a$.
Represent b by a point segment $l_b$ at p to get $D_{i+1}$. In both the subcases, it is
clear that the new line segments added in this stage do not intersect any other
line segments in $D_i$ except $l_c$ and $l_a$. It is easy to check that the line segment $l_a$,
the point segment $l_b$, and also the edge ab are extendable from p in $D_{i+1}$. Since
ab is extendable and b is represented by a point segment, $D_{i+1}$ is extendable.
Repeating the above construction $n - 1$ times gives a $B_0$-VPG drawing $D_n$
of $G_n = G$.


Since $B_0$-VPG is easily seen to be a hereditary graph class (closed under
induced subgraphs), the next theorem follows from Lemmas 1 and 2.


Theorem 2. Every linear outerplanar graph is $B_0$-VPG.
Together with Theorem 1, the above establishes the main result in this article. 


Theorem 3. Every AT-free outerplanar graph is $B_0$-VPG.


4 Concluding Remarks 


Even though we showed that all linear outerplanar graphs and, in particular,
all AT-free outerplanar graphs are $B_0$-VPG, the characterization of $B_0$-VPG
outerplanar graphs still evades us. It is easy to see that linearity is not necessary
for $B_0$-VPG outerplanar graphs. Planar bipartite graphs, and hence outerplanar
bipartite graphs, are $B_0$-VPG [18]. But outerplanar bipartite graphs can be far
from being linear, in the sense that their weak duals can be trees with arbitrarily
large degrees for internal nodes. More than our result, we think the drawing
technique employed in the proof of Lemma 2 might become useful in an attempt
to characterize $B_0$-VPG outerplanar graphs.


Acknowledgments. We thank K. Muralikrishnan for posing the question of
characterizing $B_0$-VPG outerplanar graphs.
Abstract. We investigate the complexity of finding a minimum Steiner
tree in new subclasses of split graphs, namely tree-convex split graphs
and circular-convex split graphs. It is known that the Steiner tree problem
(STREE) is NP-complete on split graphs [1]. To strengthen this
result, we introduce convex ordering on one of the partitions (clique or
independent set), and prove that STREE is polynomial-time solvable for
tree-convex split graphs with convexity on the clique (K), whereas STREE is
NP-complete on tree-convex split graphs with convexity on the independent
set (I). Further, we show that STREE is polynomial-time solvable for
path (triad)-convex split graphs with convexity on I, and circular-convex
split graphs. Finally, we show that STREE can be used as a framework
for the dominating set problem in split graphs, and hence the complexity
of STREE and the dominating set problem is the same for all these
graph classes.


Keywords: Steiner tree · Tree-convex · Path-convex · Triad-convex ·
Circular-convex · Domination


1 Introduction 


The computational complexity of the Steiner tree problem (STREE), Domination,
and their variants for different classes of graphs has been well studied. Given
a graph G with terminal set $R \subseteq V(G)$, STREE asks for a set $S \subseteq V(G) \setminus R$
such that the graph induced on $S \cup R$ is connected. The objective is to minimize
the number of vertices in S. STREE is NP-complete for general graphs, chordal
bipartite graphs [2], and split graphs [3], whose vertex set can be partitioned
into a clique and an independent set. It is polynomial-time solvable in strongly
chordal graphs [3], series-parallel graphs [4], outerplanar graphs [5], and
graphs with fixed treewidth [6]. It is known [1] that STREE is polynomial-time
solvable in $K_{1,3}$-free split graphs and $K_{1,4}$-free split graphs, whereas in $K_{1,5}$-free
split graphs, STREE is NP-complete. In this paper, we focus on new subclasses
of split graphs, and study the tractability versus intractability status of STREE
in those subclasses of split graphs.

It is important to highlight that many problems that are NP-complete on 
bipartite graphs become polynomial-time solvable when a linear ordering is 
imposed on one of the partitions. Such graphs are known as convex bipartite 
graphs in the literature. For example, the dominating set problem is NP-complete 
on bipartite graphs, whereas it is polynomial-time solvable in convex bipartite 
graphs [7]. A bipartite graph G = (X,Y) is said to be tree-convex if there is a 
tree (imaginary) on X such that the neighborhood of each y in Y is a subtree 
in X. Apart from linear ordering (path-convex ordering), tree-convex ordering, 
triad-convex ordering, and circular-convex ordering on bipartite graphs have 
been considered in the literature [8]. Further, the convex ordering on bipartite 
graphs yielded many interesting algorithmic results for domination and Hamil- 
tonicity [9]. Similarly, the feedback vertex set problem (FVS) is NP-complete 
on star-convex bipartite graphs, and comb-convex bipartite graphs, whereas it is 
polynomial-time solvable in chordal bipartite graphs and convex bipartite graphs 
[9]. Thus, the convex ordering on bipartite graphs reinforces the borderline sep- 
arating P-versus-NPC instances of many classical combinatorial problems. 

Since the tractability versus intractability status of many combinatorial prob- 
lems on bipartite graphs can be investigated with the help of convex ordering 
on bipartite graphs, we wish to extend this line of study to split graphs as well. 
To the best of our knowledge, this paper makes the first attempt at introducing
convex properties on split graphs. Many classical problems such
as Domination, Steiner tree, Hamiltonicity, and their variants are NP-complete
on split graphs. By imposing convex ordering on one of the partitions (clique
or independent set), we wish to investigate the P-versus-NPC status of STREE on
this new subclass of split graphs. As part of this paper, we consider the following
convex properties: star-convex, tree-convex, comb-convex, path-convex, and
circular-convex split graphs. Further, we look at split graphs having convexity
on I (independent set) and split graphs having convexity on K (clique).

For tree-convex and circular-convex split graphs, the computational complexity
of the following graph problems is studied in this paper.


1. The Steiner tree problem (STREE).
Instance: A graph G, a terminal set $R \subseteq V(G)$, and a positive integer k.
Question: Does there exist a set $S \subseteq V(G) \setminus R$ such that $|S| \le k$, and $G[S \cup R]$
is connected?
2. The Dominating set problem (DS).
Instance: A graph G, and a positive integer k.
Question: Does G admit a dominating set of size at most k?
3. The Connected Dominating set problem (CDS).
Instance: A graph G, and a positive integer k.
Question: Does G admit a connected dominating set of size at most k?
4. The Total Dominating set problem (TDS).
Instance: A graph G, and a positive integer k.
Question: Does G admit a total dominating set of size at most k?


All these problems are NP-complete for general graphs, bipartite graphs, and
split graphs [1,8,10]. The complexity of these problems in tree-convex bipartite
graphs has been considered in [9]. In this paper, we analyze the complexity of
STREE in tree-convex split graphs and its subclasses (triad-convex, star-convex,
comb-convex) with convexity on I (K). An interesting theoretical question is


-What is the boundary between the tractability and intractability of STREE in 
split graphs ? 


In this paper, we answer this question by imposing a convex ordering on the clique or
the independent set. In particular, we show that STREE is polynomial-time solvable
for tree-convex split graphs with convexity on K, and is NP-complete for
tree-convex split graphs with convexity on I. Further, we investigate the path-convex
and triad-convex properties, and show that STREE is polynomial-time solvable for triad
(path)-convex split graphs with convexity on I and circular-convex split graphs.


This paper is structured as follows: In Sect. 2, the complexity results for convex
split graphs with convexity on I are shown, and the complexity results for convex
split graphs with convexity on K are shown in Sect. 3. Using the results of Sects. 2
and 3, we establish the complexities of DS, CDS, and TDS in convex split graphs.


Graph Preliminaries: In this paper, we consider connected, undirected,
unweighted and simple graphs. For a graph G, V(G) denotes the vertex set
and E(G) represents the edge set. For a set $S \subseteq V(G)$, G[S] denotes the
subgraph of G induced on the vertex set S. The open neighborhood of a vertex
v is $N_G(v) = \{u \mid \{u,v\} \in E(G)\}$ and the closed neighborhood of v is
$N_G[v] = \{v\} \cup N_G(v)$. The degree of vertex v is $d_G(v) = |N_G(v)|$. A split
graph is a graph G in which V(G) can be partitioned into two sets: a clique
K and an independent set I. A split graph is written as $G = (K \cup I, E)$,
where K is a maximal clique and I is an independent set. In a split graph,
for each vertex u in K, $N_G^I(u) = N_G(u) \cap I$, $d_G^I(u) = |N_G^I(u)|$, and for each
vertex v in I, $N_G^K(v) = N_G(v) \cap K$, $d_G^K(v) = |N_G^K(v)|$. For each vertex u in K,
$N_G^I[u] = (N_G(u) \cap I) \cup \{u\}$, and for each vertex v in I, $N_G^K[v] = (N_G(v) \cap K) \cup \{v\}$.
For a split graph G, $\Delta_G^I = \max\{d_G^I(u) : u \in K\}$ and $\Delta_G^K = \max\{d_G^K(v) : v \in I\}$.


Definition 1. A split graph $G = (K \cup I, E)$ is called $\pi$-convex with convexity
on K if there is an associated structure $\pi = (K, F)$ in K such that for each
vertex $v \in I$, its neighborhood $N_G(v)$ induces a connected subgraph in $\pi$.


Definition 2. A split graph $G = (K \cup I, E)$ is called $\pi$-convex with convexity on
I if there is an associated tree $\pi = (I, F')$ in I such that for each vertex $v \in K$,
its neighborhood $N_G^I(v)$ induces a connected subgraph in $\pi$.


In general, $\pi$ can be any arbitrary structure. In this paper, we consider the
following structures for $\pi$: "tree", "star", "triad", "path", and "cycle".


118 A. Mohanapriya et al. 


For STREE, we solve for the case R = I, and for all the other cases, we obtain
solutions using the following transformations and the algorithm for R = I.
Case 1: $R = K$ or $R \subset K$.

For $R = K$ or $R \subset K$, the Steiner set is the empty set.

Case 2: $R \subset I$.

For $R \subset I$, we transform the graph G to G' and the solution to G is obtained
using G'. The transformation is defined as follows: $G' = G - I'$, where $I' = I - R$
and $R' = R$.

Case 3: $R \cap K \neq \emptyset$ and $R \cap I \neq \emptyset$.

Similar to Case 2, we obtain the solution for this case using the following
transformation. Let $W = R \cap K$ and $G' = G - N_G^I[W] - I'$ with $I' = I - R$. Let
$R' = I'$. Then the solution to (G, R) is obtained using (G', R').


2 STREE in Split Graphs with Convexity on I 


When we refer to convex split graphs in this section, we refer to convex split
graphs with convexity on I. For STREE on split graphs with convexity on I,
we establish hardness results for star-convex and comb-convex split graphs, and 
polynomial-time algorithms for path-convex, triad-convex, and circular-convex 
split graphs. 


2.1 Star-Convex Split Graphs 


In this section, we establish the classical hardness of STREE in star-convex split
graphs by presenting a polynomial-time reduction from the Exact-3-Cover problem
to STREE in star-convex split graphs with convexity on I.

The decision version of Exact-3-Cover problem (X3C) is defined below: 


EXACT-3-COVER (X,C)

Instance: A finite set X with |X| = 3q and a collection
$C = \{C_1, C_2, \ldots, C_m\}$ of 3-element subsets of X.

Question: Is there a subcollection $C' \subseteq C$ such that for every $x \in X$, x
belongs to exactly one member of C' (that is, C' partitions X)?


The decision version of Steiner tree problem is defined below: 


STREE (G,R,k)

Instance: A graph G, a terminal set $R \subseteq V(G)$, and a positive integer k.
Question: Is there a set $S \subseteq V(G) \setminus R$ such that $|S| \le k$, and $G[S \cup R]$ is
connected?


Theorem 1. For star-convex split graphs with convexity on I, STREE is NP- 
complete. 
Proof. STREE is in NP: Given a star-convex split graph G with convexity on I
and a certificate $S \subseteq V(G)$, we show that there exists a deterministic polynomial-time
algorithm for verifying the validity of S. Note that the standard Breadth
First Search (BFS) algorithm can be used to check whether $G[S \cup R]$ is connected.
It is easy to check whether $|S| \le k$. The certificate verification can be done in
O(|V(G)| + |E(G)|). Thus, we conclude that STREE is in NP.
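
The certificate check described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `is_steiner_set` and the adjacency-dictionary representation of G are assumptions made for the example.

```python
from collections import deque


def is_steiner_set(adj, terminals, steiner, k):
    """Verify a STREE certificate.

    adj maps each vertex to an iterable of its neighbours (assumed
    representation). The check passes when |S| <= k, S avoids R, and
    the subgraph induced on S ∪ R is connected (tested by one BFS).
    """
    steiner, terminals = set(steiner), set(terminals)
    if len(steiner) > k or steiner & terminals:
        return False
    allowed = steiner | terminals
    if not allowed:
        return True
    start = next(iter(allowed))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Only walk inside the induced subgraph G[S ∪ R].
            if v in allowed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == allowed
```

Each vertex and edge is inspected at most once, which matches the O(|V(G)| + |E(G)|) bound stated in the proof.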


STREE is NP-Hard: It is known [11] that X3C is NP-complete. X3C can be
reduced in polynomial time to STREE in star-convex split graphs with convexity
on I using the following reduction. We map an instance (X,C) of X3C to the
corresponding instance (G, R, k) of STREE as follows: $V(G) = V_1 \cup V_2$,
$V_1 = \{c_i \mid 1 \le i \le m\}$, $V_2 = \{x_1, x_2, \ldots, x_{3q}, x_{3q+1}\}$,
$E(G) = \{\{x_j, c_i\} \mid x_j \in C_i, 1 \le j \le 3q, 1 \le i \le m\} \cup \{\{x_{3q+1}, c_i\} \mid 1 \le i \le m\} \cup \{\{c_i, c_j\} \mid 1 \le i < j \le m\}$.
Let $R = V_2$, $k = q$. Note that G is a split graph with $V_1$ being a clique and $V_2$
being an independent set. Now we show that G is a star-convex split graph by
defining a star T on $V_2$ as follows:

Let $V(T) = V_2$ and $E(T) = \{\{x_{3q+1}, x_i\} \mid 1 \le i \le 3q\}$. We see that $x_{3q+1}$ is the
root of the star T.

An illustration for X3C with $X = \{x_1, x_2, x_3, x_4, x_5, x_6\}$ and $C = \{C_1 =
\{x_1, x_2, x_3\}, C_2 = \{x_2, x_3, x_4\}, C_3 = \{x_1, x_2, x_5\}, C_4 = \{x_2, x_5, x_6\}, C_5 =
\{x_1, x_5, x_6\}\}$, and the graph G with R = I, k = 2 and the imaginary star on I
rooted at $x_7$ corresponding to X3C, is shown in Fig. 1. For $C' = \{C_2, C_5\}$ and
k = 2, we see that $S = \{c_2, c_5\}$ is the desired Steiner set.
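
The reduction and the illustration can be reproduced with a short sketch. The function names (`x3c_to_stree`, `covers_terminals`) and the tuple labels for vertices are assumptions made for this example, and the five triples are the illustration's instance as best read from the text. Because the candidate Steiner vertices form a clique, checking that every terminal has a neighbour in S is enough to establish connectivity of G[S ∪ R] here.

```python
from itertools import combinations


def x3c_to_stree(q, triples):
    """Build (adj, R, k) from an X3C instance on X = {1, ..., 3q}.

    Vertices ("c", i) form the clique V1; vertices ("x", j) form the
    independent set V2, with ("x", 3q + 1) as the root of the star.
    """
    m = len(triples)
    root = ("x", 3 * q + 1)
    adj = {("x", j): set() for j in range(1, 3 * q + 2)}
    adj.update({("c", i): set() for i in range(1, m + 1)})

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i, triple in enumerate(triples, start=1):
        add_edge(("c", i), root)          # every c_i sees x_{3q+1}
        for j in triple:
            add_edge(("c", i), ("x", j))  # c_i sees the elements of C_i
    for i, j in combinations(range(1, m + 1), 2):
        add_edge(("c", i), ("c", j))      # V1 is a clique
    terminals = {("x", j) for j in range(1, 3 * q + 2)}
    return adj, terminals, q


def covers_terminals(adj, terminals, steiner):
    """True when every terminal has at least one neighbour in the set."""
    return all(adj[t] & steiner for t in terminals)


triples = [{1, 2, 3}, {2, 3, 4}, {1, 2, 5}, {2, 5, 6}, {1, 5, 6}]
adj, R, k = x3c_to_stree(2, triples)
print(covers_terminals(adj, R, {("c", 2), ("c", 5)}))
```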


[Figure 1 panel labels: clique on K ($K_5$); imaginary star of G with respect to I.]


Fig. 1. Reduction: An instance of X3C to STREE in star-convex split graphs with
convexity on I


Claim. Exact-3-Cover (X,C) if and only if STREE (G, R, k = q)


Proof. Only if: If there exists $C' \subseteq C$ which partitions all elements of X, then
the set of vertices $S = \{c_i \in V_1 \mid C_i \in C'\}$, where $c_i$ is the vertex corresponding
to $C_i$, forms a Steiner set S with $R = V_2$. Also, note that |S| = q.



If: Assume that there exists a Steiner tree in G for R. Let $S \subseteq V_1$ be the
Steiner set, |S| = q. We now construct a solution $C' = \{C_i \in C \mid c_i \in S\}$ to X3C. Since
|S| = q, we have |C'| = q. Further, S is the Steiner set for the terminal set
$R = \{x_1, \ldots, x_{3q}, x_{3q+1}\}$. For any $c_i \in S$, we have $|N_G^I(c_i) \setminus \{x_{3q+1}\}| = 3$. Since
|S| = q and the q vertices of S together cover all 3q vertices $x_1, \ldots, x_{3q}$, for all
$c_i, c_j \in S$, $i \neq j$, $N_G^I(c_i) \cap N_G^I(c_j) = \{x_{3q+1}\}$. Therefore, C' is the
corresponding solution to X3C.


Therefore, STREE is NP-complete on star-convex split graphs with convexity 
on I. 


Corollary 1. STREE in tree-convex split graphs with convexity on I is NP- 
complete. 


Proof. Since star-convex split graphs are a subclass of tree-convex split graphs, 
from Theorem 1 this result follows. 


The study of parameterized complexity is concerned with designing algorithms
with complexity $f(k)\,n^{O(1)}$, where k is the parameter of interest, usually the
solution size, and n is the input size. In [12] it is shown that STREE in general
graphs is fixed-parameter tractable (FPT) if the parameter is the size of the
terminal set.

It is known [13] that STREE in general graphs with parameter |S| is W[2]-hard.
We now prove a similar result for our graph class.


Theorem 2. For star-convex split graphs with convexity on I, STREE is W[1]-hard
with parameter |S|.


Proof. It is known [14] that the Exact Cover problem (generalization of X3C)
with parameter |C'| is W[1]-hard. Note that the reduction presented in Theorem
1 maps (X,C) to (G, R, k = q), where q is |C'|. Hence the reduction is a
parameterized reduction. Therefore, STREE in star-convex split graphs with convexity
on I is W[1]-hard.


2.2 Comb-Convex Split Graphs 


To strengthen the NP-completeness result of STREE on tree-convex split graphs
with convexity on I, we show that on comb-convex split graphs with convexity
on I, STREE is NP-complete. The polynomial-time reduction is from the vertex
cover problem on general graphs.

The decision version of Vertex Cover problem (VC) is defined below: 


VC (G,k) 

Instance: A graph G, a non-negative integer k.

Question: Does there exist a set $S \subseteq V(G)$ such that for each edge
$e = \{u,v\} \in E(G)$, $u \in S$ or $v \in S$, and $|S| \le k$?


Theorem 3. For comb-convex split graphs with convexity on I, STREE is NP- 
complete. 
Proof. STREE is NP-Hard: It is known [15] that VC on general graphs is
NP-complete and this can be reduced in polynomial time to STREE in
comb-convex split graphs using the following reduction. We map an instance (G,k) of
VC on general graphs to the corresponding instance $(G^*, R, k' = k)$ of STREE
as follows: $V(G^*) = V_1 \cup V_2 \cup V_3$,

$V_1 = \{x_i \mid v_i \in V(G)\}$,

$V_2 = \{y_i \mid e_i \in E(G)\}$,

$V_3 = \{z_i \mid e_i \in E(G)\}$.

We shall now describe the edges of $G^*$:


$E(G^*) = E_1 \cup E_2 \cup E_3$,

$E_1 = \{\{y_i, x_k\}, \{y_i, x_l\} \mid e_i = \{v_k, v_l\} \in E(G), x_k, x_l \in V_1, y_i \in V_2, 1 \le i \le m, 1 \le k \le n, 1 \le l \le n\}$,

$E_2 = \{\{x_k, z_i\} \mid x_k \in V_1, z_i \in V_3, 1 \le i \le m\}$,

$E_3 = \{\{x_i, x_j\} \mid 1 \le i < j \le n\}$.


We define $K = V_1$, $I = V_2 \cup V_3$, and the imaginary comb T on I is defined with
$V_3$ as the backbone and $V_2$ as the pendant vertex set. That is, $V(T) = I$ and
$E(T) = \{\{y_i, z_i\} \mid 1 \le i \le m\} \cup \{\{z_i, z_{i+1}\} \mid 1 \le i \le m-1\}$.

An example is illustrated in Fig. 2: the vertex cover instance G(V, E) with
k = 2 is mapped to the STREE instance on the comb-convex split graph $G^*(V^*, E^*)$
with $R = \{y_1, y_2, y_3\}$, $k' = 2$.
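
A small sketch of this construction may help in following the claims below. The names `vc_to_stree` and `cover_gives_steiner_set`, the tuple vertex labels, and the 4-vertex path used as input are assumptions for illustration; the path is a hypothetical instance with three edges and is not necessarily the graph drawn in Fig. 2.

```python
def vc_to_stree(n, edges):
    """Map a VC instance (vertices 1..n, edges as (k, l) pairs) to
    (adj, K, I, R) following the edge sets E1, E2, E3 of the reduction."""
    K = [("x", i) for i in range(1, n + 1)]
    m = len(edges)
    Y = [("y", i) for i in range(1, m + 1)]
    Z = [("z", i) for i in range(1, m + 1)]
    adj = {v: set() for v in K + Y + Z}

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i, (k, l) in enumerate(edges, start=1):
        add_edge(("y", i), ("x", k))  # E1: y_i sees both endpoints of e_i
        add_edge(("y", i), ("x", l))
    for x in K:                        # E2: every x is adjacent to every z
        for z in Z:
            add_edge(x, z)
    for a in range(n):                 # E3: V1 is a clique
        for b in range(a + 1, n):
            add_edge(K[a], K[b])
    return adj, set(K), set(Y) | set(Z), set(Y)


def cover_gives_steiner_set(adj, terminals, cover):
    """Translate a vertex subset of G into x-vertices of G* and test
    whether every terminal y_i has a neighbour among them."""
    steiner = {("x", v) for v in cover}
    return all(adj[y] & steiner for y in terminals)


# Hypothetical instance: the path v1 - v2 - v3 - v4.
adj, K, I, R = vc_to_stree(4, [(1, 2), (2, 3), (3, 4)])
print(cover_gives_steiner_set(adj, R, {2, 3}))  # {v2, v3} is a vertex cover
print(cover_gives_steiner_set(adj, R, {1, 4}))  # {v1, v4} misses edge v2v3
```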


[Figure 2 panel label: imaginary comb of G* with respect to I.]


Fig. 2. An example: VC reduces to STREE. 


Claim. $G^*$ is a comb-convex split graph with convexity on I.


Proof. For each $x_i \in V_1$, $N_{G^*}^I(x_i) = V_3 \cup W$, $W \subseteq V_2$. By construction $x_i$ is
adjacent to all of $V_3$. Therefore, the graph induced on $V_3 \cup W$ is a subtree in T.
Hence $G^*$ is a comb-convex split graph with convexity on I.
Claim. (G,k) has a vertex cover with at most k vertices if and only if
$(G^*, R = \{y_i \mid 1 \le i \le m\}, k' = k)$ has a Steiner tree with at most k' = k Steiner
vertices.


Proof. (Only if) Let $V' = \{v_i \mid 1 \le i \le k\}$ be a vertex cover of size k in G.
Then we construct the Steiner set S of $G^*$ for $R = \{y_i \mid 1 \le i \le m\}$ as follows:
$S = \{x_i \mid 1 \le i \le k, v_i \in V', x_i \in V(G^*)\}$. Since V' is a vertex cover, for
any edge $e_i = \{v_k, v_l\} \in E(G)$, $v_k$ or $v_l$ is in V'. Hence S contains $x_k$ or $x_l$.
Therefore, for each vertex $y_i$, there exists a neighbor in S. Since $V_1$ is a clique
by our construction, $G^*[R \cup S]$ is connected.

(If) For R in $G^*$, let $S = \{x_i \mid 1 \le i \le k'\}$ be a Steiner set of $G^*$ of size k'.
Then, we construct the vertex cover V' of size k in G as follows: $V' = \{v_i \mid
x_i \in S, v_i \in V(G), 1 \le i \le k'\}$. We now claim that V' is a vertex cover in G.
Suppose that there is an edge $e_i = \{v_k, v_l\} \in E(G)$ for which neither $v_k$ nor $v_l$ is
in V'. This implies that neither $x_k$ nor $x_l$ is in S. Since R contains $y_i$, it follows
that $N_{G^*}(y_i) \cap S = \emptyset$. Thus S is not a Steiner set, a contradiction. Therefore, V'
is a vertex cover of size k in G.


Therefore, STREE is NP-complete on comb-convex split graphs with convexity 
on I. 


A closer look at the reduction reveals that the presence of pendant vertices in the 
comb makes the problem NP-hard. Therefore, we shall investigate the complexity 
of STREE in a variant of comb-convex split graphs where there are no pendant 
vertices in the comb which is precisely the class of path-convex split graphs. 
Interestingly STREE in path-convex split graph is polynomial-time solvable, 
which we establish in the next section. 


2.3 Path-Convex Split Graphs


In this section, we show that STREE in path-convex split graphs is
polynomial-time solvable. Recall that a split graph G is called path-convex if there exists a
linear ordering of vertices in I such that for each $u \in K$, $N_G^I(u)$ is a sequence of
consecutive vertices.

Let G be a path-convex split graph. Let the vertices in I be $x_1, \ldots, x_n$ and
let the vertices in K be $w_1, \ldots, w_s$. We know that there exists an imaginary
path on I and for each vertex $u \in K$, $N_G^I(u)$ is a subpath in the imaginary
path. For each $u \in K$, $l(u)$ is the least vertex in $N_G^I(u)$, and $r(u)$ is the
greatest vertex in $N_G^I(u)$. For each vertex $x_i \in I$, we define
$T(x_i) = \{u \mid u \in N_G^K(x_i) \text{ and } r(u) \text{ is maximum}\}$. Let $w(x_i)$ be an arbitrary vertex from $T(x_i)$.

The Steiner tree algorithm for path-convex split graphs identifies the vertex
$w \in K$ adjacent to $x_1$ such that $r(w)$ is maximum, and continues from the vertex
following $r(w)$. This greedy approach is indeed optimum, which we establish in this section.

The following algorithm computes a minimum Steiner set for G. 
Algorithm 1. STREE for path-convex split graphs with convexity on I.
1: Input: A path-convex split graph G with convexity on I and R = I.
2: Let $a = x_1$, $S = \emptyset$.
3: Let $S = S \cup \{w(a)\}$.
4: if $x_n \in N_G^I(w(a))$ then
5:   Output S.
6: else
7:   Let $x_j$ be $r(w(a))$.
8:   Let $a = x_{j+1}$ and continue from Step 3.
9: end if
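
The greedy loop of Algorithm 1 can be sketched in a few lines. This is an illustrative rendering under stated assumptions rather than the authors' implementation: each vertex u of K is given by the pair (l(u), r(u)) of 1-based positions on the imaginary path, and the names `path_convex_steiner` and `intervals` are hypothetical.

```python
def path_convex_steiner(n, intervals):
    """Greedy Steiner set for R = I on a path-convex split graph.

    n is |I|; intervals maps each u in K to (l(u), r(u)), the first and
    last positions of its neighbourhood on the path x_1, ..., x_n.
    """
    chosen = []
    a = 1  # position of the leftmost terminal not yet covered
    while a <= n:
        # Candidates are the neighbours of x_a; w(a) maximises r(u).
        candidates = [u for u, (l, r) in intervals.items() if l <= a <= r]
        if not candidates:
            raise ValueError(f"x_{a} has no neighbour in K")
        w = max(candidates, key=lambda u: intervals[u][1])
        chosen.append(w)
        a = intervals[w][1] + 1  # continue from the vertex after r(w)
    return chosen


# Hypothetical instance with six terminals and five clique vertices.
intervals = {"w1": (1, 2), "w2": (1, 4), "w3": (3, 5), "w4": (5, 6), "w5": (4, 5)}
print(path_convex_steiner(6, intervals))
```

Scanning all of K once per chosen vertex keeps the sketch within the O(mn) bound mentioned in the text when m is read as |K|; since K is a clique, covering every terminal is sufficient for G[S ∪ R] to be connected.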


By $r(w_i) \prec r(w_j)$, $w_i, w_j \in K$, we mean that the vertex $r(w_i)$ appears before
$r(w_j)$. Let $S = \{u_1, \ldots, u_p\}$ be the Steiner vertices chosen by the algorithm. Note
that as per our algorithm $r(u_1) \prec r(u_2) \prec \ldots \prec r(u_p)$. Let $S' = \{v_1, \ldots, v_q\}$ be
the Steiner vertices chosen by any optimal algorithm. Without loss of generality,
we arrange S' such that $r(v_1) \preceq r(v_2) \preceq \ldots \preceq r(v_q)$.


Theorem 4. For $1 \le k \le q$, let $x_j = r(u_k)$, and $x_l = r(v_k)$. Then, $j \ge l$.


Proof. By mathematical induction on k, $k \ge 1$.

Base Case: For k = 1, Algorithm 1 has chosen $u_1$. By Step 2 of the
algorithm, $r(u_1) \succeq r(u_i)$ for any $u_i \in N_G^K(x_1)$. Therefore, $r(u_1) \succeq r(v_1)$. Thus
$j \ge l$ is true for the base case.

Induction Hypothesis: Assume that for $k \ge 2$, $j \ge l$ is true.

Induction Step: We prove that at the (k+1)th iteration, $k \ge 2$, $j \ge l$. By our induction
hypothesis, we know that up to the kth iteration $j \ge l$. We know that $x_j = r(u_k)$,
and $x_l = r(v_k)$. Let the vertex chosen by the algorithm at the (k+1)th iteration be
$u_{k+1}$. Since S' is the optimal solution, let $v_{k+1}$ be the (k+1)th vertex in S' such
that either $r(v_{k+1})$ is less than or equal to $x_j$, or $v_{k+1}$ is adjacent to $x_{j+1}$.

Case 1: If $r(v_{k+1})$ is less than or equal to $x_j$, then, since at the (k+1)th
iteration $r(u_{k+1}) \succ x_j$, we have $r(u_{k+1}) \succeq r(v_{k+1})$.
Case 2: Consider the case when $v_{k+1}$ is adjacent to $x_{j+1}$. We know that $v_{k+1}$
is in S' such that $S' \cap N_G^K(x_{j+1}) \neq \emptyset$. Suppose that $r(u_{k+1}) \prec r(v_{k+1})$. At the (k+1)th
iteration, according to Step 3 and Step 2 of the algorithm, $w(x_{j+1})$ is included in
S. Hence $u_{k+1} = w(x_{j+1})$. Recall that as per our definition of $w(x_{j+1})$, it is a
vertex u adjacent to $x_{j+1}$ such that $r(u)$ is maximum. Thus $r(u_{k+1}) \prec r(v_{k+1})$ is
a contradiction. Therefore, $r(u_{k+1}) \succeq r(v_{k+1})$, and at the (k+1)th iteration, $j \ge l$.


Theorem 5. Algorithm 1 outputs a minimum Steiner set, that is, p = q.


Proof. By Theorem 4, we know that $r(u_q) \succeq r(v_q)$, and hence $|S| \le |S'|$. Since
S' is an optimal solution, $|S| \ge |S'|$. By Step 3 of the algorithm we know that
$x_n \in N_G^I(u_p)$. Therefore, |S| = |S'|, and p = q.


It is easy to see that Algorithm 1 runs in time O(mn). 
We now see that STREE is NP-complete for comb-convex split graphs, whereas
it is polynomial-time solvable for path-convex split graphs. This brings out the
P-versus-NPC boundary of STREE within tree-convex split graphs, which is one
of the objectives of this research.

It is important to highlight that we can solve STREE on triad-convex and 
circular-convex split graphs by using the algorithm of path-convex split graphs 
as a black box. 


3 STREE in Split Graphs with Convexity on K


Having analyzed the P-versus-NPC status of STREE for convex split graphs with
convexity on I, we shall now analyze the same with respect to split graphs having
convexity on K.


3.1 Tree-Convex Split Graphs 


In this section, we present a polynomial-time algorithm to find a minimum
Steiner tree in tree-convex split graphs. We first solve the case R = I, and
using it we solve (i) $R \subset I$ and (ii) $R \cap K \neq \emptyset$ and $R \cap I \neq \emptyset$. Let $G(K \cup I, E)$
be a tree-convex split graph with an imaginary tree T.

To construct the Steiner tree for G, we use the following scheme. Initially
all vertices in the imaginary tree T are colored gray. A gray vertex is
recolored white or black as per the following rules:

Rule 1 (Gray is colored white): A leaf vertex $u \in T$ is recolored white when there
does not exist a pendant vertex in $N_G^I(u)$.

Rule 2 (Gray is colored black): A leaf vertex $u \in T$ is recolored black when there
exists a pendant vertex in $N_G^I(u)$.

The algorithm that computes a Steiner tree for the case R = I works with the
imaginary tree T.

The Steiner tree algorithm for R = I starts from an arbitrary leaf vertex, say
$u \in T$, and checks which rule is applicable to u. If Rule 1 is applied, then
G is modified to G = G − u. If Rule 2 is applied, then G is modified to
$G = G - N_G^I(u)$. We continue the process $|K| - 1$ times. In this process the
vertices that are colored black are included in the solution and they are precisely
the Steiner set, which we prove in Theorem 6.
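
The leaf-peeling scheme can be sketched as follows. This is a hedged reading of the procedure, not the authors' code, and it fills in two details that the description leaves open: a black leaf is removed from T together with the terminals it covers, and the last remaining vertex of T is also examined. The function name `tree_convex_steiner` and the two dictionaries used as input are assumptions for the example.

```python
def tree_convex_steiner(tree, nbr_of_terminal):
    """Leaf-peeling sketch for R = I with convexity on K.

    tree maps each vertex of K to its neighbours in the imaginary tree T.
    nbr_of_terminal maps each vertex of I to its neighbours in K, assumed
    to induce a subtree of T. Returns the vertices coloured black.
    """
    tree = {u: set(vs) for u, vs in tree.items()}                  # working copy of T
    remaining = {v: set(ns) for v, ns in nbr_of_terminal.items()}  # current N^K(v)
    black = []
    while tree:
        if len(tree) == 1:
            u = next(iter(tree))   # assumption: the last vertex is checked too
        else:
            u = next(x for x, vs in tree.items() if len(vs) == 1)  # a leaf of T
        if any(ns == {u} for ns in remaining.values()):
            # Rule 2: some terminal is pendant at u, so u is coloured black
            # and every terminal adjacent to u is removed.
            black.append(u)
            remaining = {v: ns for v, ns in remaining.items() if u not in ns}
        else:
            # Rule 1: u is coloured white and deleted from the graph.
            for ns in remaining.values():
                ns.discard(u)
        for p in tree.pop(u):      # delete u from T in both cases
            tree[p].discard(u)
    return black


# Hypothetical instance: T is the path k1 - k2 - k3.
T = {"k1": {"k2"}, "k2": {"k1", "k3"}, "k3": {"k2"}}
terminals = {"a": {"k1", "k2"}, "b": {"k2"}, "c": {"k2", "k3"}}
print(tree_convex_steiner(T, terminals))
```

In this reading, "pendant" is evaluated in the current graph, so a terminal whose other neighbours have already been deleted becomes pendant at the leaf being processed.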


Theorem 6. The Steiner set S obtained using the above procedure is a
minimum Steiner set.


Proof. Suppose, on the contrary, that there exists a Steiner set S' for G such that |S'| < |S|.
Since |S'| < |S|, the coloring obtained from the algorithm is not optimal. Hence
there exists a coloring with more white vertices and fewer black vertices than
the coloring obtained from the algorithm.

A gray leaf vertex u is colored black when there exists a pendant vertex
$x \in I$ adjacent to u in that instance. We know that G is a tree-convex split
graph with convexity on K; let u be a leaf vertex in T, and let v be the unique
vertex adjacent to u in T. Suppose that u is not adjacent to a pendant vertex
$x \in I$. Then $N_G^I(u) \subseteq N_G^I(v)$. Hence including v in the Steiner set instead of u
connects at least as many vertices. Suppose that u is adjacent to a pendant vertex
$x \in I$. Then u is the only neighbor of x, so u belongs to every Steiner set, and u is
colored black. This is the invariant followed by the algorithm.

Hence a coloring having fewer black vertices than the
coloring obtained from our algorithm is not possible. Therefore, |S'| < |S| is a
contradiction and S is a minimum Steiner set.


Remarks: Since STREE in tree-convex split graphs with convexity on K is
polynomial-time solvable, STREE is polynomial-time solvable on well-known
special structures such as star-, path-, triad-, and comb-convex split graphs with
convexity on K. It is important to note that the above approach can be used as
a black box for STREE on circular-convex split graphs.

Since the Steiner set S for convex split graphs satisfies $S \subseteq K$, we observe that S is also
a DS, a CDS, and a TDS. The P-versus-NPC status of STREE for the convex properties
discussed in this paper also holds true for DS, CDS, and TDS.


Conclusion and Directions for Further Research: 

We have shown the complexity of STREE, and of domination and its variants,
in tree-convex and circular-convex split graphs. The results presented in this
paper can be used as a framework for Steiner tree variants (Steiner path and
cycle) and domination problems (outer connected domination, Roman domination)
restricted to split and bipartite graphs.
Abstract. A k-cd-coloring of a graph G is a partition of the vertex set
of G into k independent sets $V_1, \ldots, V_k$, where each $V_i$ is dominated by
some vertex of G. The least integer k such that G admits a k-cd-coloring
is called the cd-chromatic number, $\chi_{cd}(G)$, of G. We say that $S \subseteq V(G)$
is a subclique in G if $d_G(x,y) \neq 2$ for every $x, y \in S$. The cardinality
of a maximum subclique in G is called the subclique number, $\omega_s(G)$, of
G. Given a graph G and $k \in \mathbb{N}$, the problem CD-COLORABILITY checks
whether $\chi_{cd}(G) \le k$. The problem CD-COLORABILITY is NP-complete
for $K_4$-free graphs [Merounane et al., 2014], $P_5$-free graphs, and chordal
graphs [Shalu et al., 2020]. In this paper, we show that the problem
CD-COLORABILITY is O(n?)-time solvable in the intersection of the above
graph classes ($\{P_5, K_4\}$-free chordal graphs). The problem SUBCLIQUE
takes a graph G and $k \in \mathbb{N}$ as inputs and checks whether $\omega_s(G) \ge k$.
The SUBCLIQUE problem is NP-complete for $P_6$-free graphs and bipartite
graphs [Shalu et al., 2017]. We prove that the problem SUBCLIQUE is
O(n3)-time solvable in the class of Ps-free chordal bipartite graphs (a 
subclass of Ps-free bipartite graphs). In addition, we show that the cd- 
chromatic number and the subclique number are equal in these two graph 
classes. 


1 Introduction 


A proper (vertex) coloring f of a graph assigns colors to its vertices such that
adjacent vertices receive distinct colors. The set of vertices that receive the same color
is a color class. Graph coloring is one of the classical problems in the field of
combinatorics. The problem aims at reducing the number of colors required for a proper
(vertex) coloring. The minimum number of colors required to color a graph G with
adjacent vertices receiving different colors is called the chromatic number of G,
denoted by $\chi(G)$. Another well-studied problem in this field is the dominating set
problem. We say that a subset D of the vertex set V of a graph G dominates V
if every vertex in $V \setminus D$ is adjacent to some vertex in D. The cardinality of the
smallest dominating set in a graph G is called the domination number, and is denoted
by $\gamma(G)$. The dominator coloring problem [1,2,6,7] and the class domination
coloring problem [11,16] are two emerging problems in graph theory involving both
coloring and domination. In a vertex coloring f of a graph G, if each color class of
f is dominated by some vertex in G, then it is called a class domination coloring
(cd-coloring) of G [11,16]. The minimum number of colors required for cd-coloring
a graph G is called the cd-chromatic number, $\chi_{cd}(G)$, of G. If two vertices receive
the same color under some cd-coloring of G, then the distance between them in
G is two; i.e., any two vertices not at distance two will receive different colors in
any cd-coloring of G. This observation gives a lower bound for the cd-chromatic
number, called the subclique number. For a graph G, a subset S of the vertex set is
a subclique if no pair of vertices in S are at distance two from each other in G. The
cardinality of a maximum subclique in G is called the subclique number, $\omega_s(G)$,
of G. Note that the cd-chromatic number of a graph G is at least the subclique
number of G, i.e., $\chi_{cd}(G) \ge \omega_s(G)$. The computational versions of the problems
cd-coloring and subclique are as follows.


CD-COLORABILITY
Instance : A graph G and $k \in \mathbb{N}$.
Question : Is G k-cd-colorable?

SUBCLIQUE
Instance : A graph G and $k \in \mathbb{N}$.
Question : Is $\omega_s(G) \ge k$?
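The two definitions above can be checked mechanically on small graphs. The following is a minimal sketch, not part of the paper: it assumes a graph is stored as a dictionary mapping each vertex to its set of neighbors, and it assumes that a color class C is dominated by a vertex w when C is contained in the closed neighborhood N[w]. The function names are hypothetical.

```python
from itertools import combinations


def at_distance_two(adj, u, v):
    """True if u and v are distinct, non-adjacent, and share a neighbor."""
    return u != v and v not in adj[u] and bool(adj[u] & adj[v])


def is_subclique(adj, S):
    """True if no two vertices of S are at distance exactly two."""
    return not any(at_distance_two(adj, u, v) for u, v in combinations(S, 2))


def is_cd_coloring(adj, classes):
    """True if `classes` partitions V into independent sets, each inside some N[w]."""
    seen = set()
    for cls in classes:
        cls = set(cls)
        if not cls or cls & seen:
            return False  # empty class or overlapping classes
        seen |= cls
        if any(v in adj[u] for u, v in combinations(cls, 2)):
            return False  # class is not independent
        if not any(cls <= adj[w] | {w} for w in adj):
            return False  # no single vertex dominates the class
    return seen == set(adj)


# Illustrative input (an assumption): the path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On this path, {0, 3} is a subclique (the two vertices are at distance three), while {1, 3} is not; the partition {0, 2}, {1, 3} is a cd-coloring because vertex 1 dominates the first class and vertex 2 the second.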


The problem CD-COLORABILITY is NP-complete for bipartite graphs [11], 
K,4-free graphs [11], Ps-free graphs [15], and chordal graphs [15]. The problem 
k-CD-COLORABILITY is NP-complete for k > 4 [11] and is O(n°)-time solvable for 
k < 3 [13]. Kiruthika et al. [8] gave an O(2"n* log n)-time algorithm to find the 
cd-chromatic number of a graph with n vertices. They also gave FPT algorithms 
for general graphs and chordal graphs. Chen [3] proved that the cd-chromatic 


number is hard to approximate within a factor of cd ei for every € > 0. 
Das and Mishra [5] proved (approximation) hardness results of the problem CD- 
COLORABILITY in chordal graphs and bipartite graphs. Shalu et al. [15] proved 
that an optimal cd-coloring of split graphs, P,-free graphs, and claw-free graphs 
can be found in poly-time. Applications of the cd-coloring problem can be found 
in [3,9, 14]. 

For a graph G, every clique in G is also a subclique in G, and thus $\omega_s(G) \ge
\omega(G)$. The SUBCLIQUE problem is NP-complete for chordal graphs, bipartite
graphs, P-free graphs, and H-free graphs where H is any graph on four or five 
vertices other than P, [14]. The problem is poly-time solvable for split graphs and 
P,-free graphs [14]. In our previous work [12], we proved that the cd-chromatic 
number and the subclique number are equal for trees and co-bipartite graphs. 

In this paper, we prove that the cd-chromatic number and the subclique 
number of a { Ps, K4}-free chordal graph can be found in O(n?) time (see Sect. 3). 
Also, a maximum subclique of a $P_6$-free chordal bipartite graph can be found in
$O(n^3)$ time (see Sect. 4). In addition, we prove that for any graph G in one of
the above two graph classes, $\chi_{cd}(G) = \omega_s(G)$.


2 Preliminaries 


We follow West [17] for terminology and notation. The number of vertices and 
the number of edges in a graph are denoted by n and m, respectively. We denote
a path with vertex set $\{x_1, \dots, x_k\}$ and edge set $\{x_i x_{i+1} : 1 \le i \le k-1\}$ by
$x_1 x_2 \cdots x_k$. A clique C of a graph G is a subset of the vertex set such that every
pair of vertices in C is adjacent. The clique number is the size of a largest clique
in G and is denoted by $\omega(G)$. For a graph G, we say that $D \subseteq V(G)$ is a total
dominating set in G if every vertex of G has a neighbor in D. The cardinality of
a smallest total dominating set in G is called the total domination number of G,
denoted by $\gamma_t(G)$. For $Y \subseteq V(G)$, G[Y] denotes the induced subgraph of G with
vertex set Y. We denote the length of a shortest path joining x and y in G by
$d_G(x, y)$. For a vertex $u \in V(G)$ and a subset W of the vertex set,
$d_G(u, W) = \min\{d_G(u, w) \mid w \in W\}$. Given a graph H, we say that a graph G is H-free if
no induced subgraph of G is isomorphic to H. For $x \in V(G)$, the neighborhood
of x is defined as $N(x) = \{y \in V(G) \mid xy \in E(G)\}$ and $N[x] = \{x\} \cup N(x)$ is
the closed neighborhood of x. Let $A(x) = V(G) \setminus N[x]$. For $X, Y \subseteq V(G)$ with
$X \cap Y = \emptyset$, we define [X, Y] to be the set of all edges in G with one end vertex
in X and the other end vertex in Y. A set $D \subseteq V(G)$ is said to be a biclique in
G if D can be partitioned into two non-empty independent sets X and Y such
that every vertex in X is adjacent to each vertex in Y. We say that a graph H
is complete bipartite if V(H) is a biclique.


3 $\{P_5, K_4\}$-free Chordal Graphs


In this section, we show that for a $\{P_5, K_4\}$-free chordal graph G, $\chi_{cd}(G) =
\omega_s(G)$. Also, we prove that the cd-chromatic number and the subclique number
of G can be found in O(n?) time. 


Observation 1. If G is a connected $P_5$-free graph with a maximal clique $N_0$,
then the vertex set can be partitioned as $V(G) = N_0 \cup N_1 \cup N_2$ where
$N_i = \{u \in V(G) \mid d(u, N_0) = i\}$ for i = 1, 2.


Proof of Observation 1 is omitted in this paper. 


Observation 2. If G is a connected $P_5$-free chordal graph with $\omega(G) = 2$, then
G is a tree with a dominating edge and hence $\chi_{cd}(G) = \omega_s(G) = 2$. (For an
example, see Fig. 1.)


Observation 3. Let G be a connected $P_5$-free chordal graph with clique number
three. Also, let $N_0 = \{x_0, x_1, x_2\}$ be a maximum clique in G and let V(G) be
partitioned as in Observation 1. Then the following holds.


1. Let $M_i = \{u \in N_1 \mid N(u) \cap N_0 = \{x_i\}\}$ and
$L_i = \{u \in N_1 \mid N(u) \cap N_0 = N_0 \setminus \{x_i\}\}$. Then,
(i) $[M_i, M_j] = \emptyset$ for $i \ne j$, else G contains an induced $C_4$.
(ii) $[L_i, L_j] = \emptyset$ for $i \ne j$, otherwise G contains an induced $C_4$.
(iii) $[M_i, L_i] = \emptyset$, else G contains an induced $C_4$.
(iv) For i = 0, 1, 2, $L_i$ is an independent set, otherwise G contains a $K_4$.
Thus, $\bigcup_{i=0}^{2} L_i$ is an independent set since $[L_i, L_j] = \emptyset$.
2. If $u, v \in N_1$ such that uv is an edge in G, then $N(u) \cap N_0 \subseteq N(v) \cap N_0$ or
$N(v) \cap N_0 \subseteq N(u) \cap N_0$. Otherwise, G contains an induced $C_4$.


Lemma 1. Let G be a connected $P_5$-free chordal graph with $\omega(G) = 3$. Let the
vertex set be partitioned as in Observation 1 where $N_0$ is a maximum clique in
G, and $N_2 \ne \emptyset$. Then, $(\bigcup_{x \in N_2} N(x)) \cap N_1$ is a clique of size one or two.


Proof of Lemma 1 is omitted in this paper. 


Lemma 2. Let G be a connected $P_5$-free chordal graph with $\omega(G) = 3$. Let
$N_0 = \{x_0, x_1, x_2\}$ be a clique in G, and let the vertex set be partitioned as in
Observation 1. Then, G has a minimal dominating clique C such that
$C \subseteq N_0 \cup N_1$ and $C \cap N_0 \ne \emptyset$.


Proof.

Case 1: $N_2$ is empty.

Then, V(G) is dominated by $N_0$. Thus, a subset of $N_0$ forms the minimal
dominating clique C.

Case 2: $N_2$ is non-empty.

By Lemma 1, $N_2$ is dominated by $B = (\bigcup_{x \in N_2} N(x)) \cap N_1$, which induces an
edge or a vertex in $G[N_1]$. Let $C_1$ be a minimal subset of B that dominates $N_2$.
Case 2.1: $|C_1| = 2$.

Let $C_1 = \{u, v\}$ where $uv \in E(G)$. Then, by Observation 3.2, $N(u) \cap N_0 \subseteq
N(v) \cap N_0$ or $N(v) \cap N_0 \subseteq N(u) \cap N_0$. Without loss of generality, assume that
$N(u) \cap N_0 \subseteq N(v) \cap N_0$. Since $u \in N_1$, u has a neighbor in $N_0$. Let $x_0 \in N(u)$.
Then, $x_1, x_2 \notin N(u)$ because when $x_1 u$ or $x_2 u$ is an edge in G, $\{x_0, x_1, u, v\}$ or
$\{x_0, x_2, u, v\}$ induces a $K_4$ in G. Note that $x_0 v \in E(G)$ and $\{x_0, u, v\}$ induces a
triangle in G.


Claim 1: $D = \{x_0, u, v\}$ dominates V(G).

Clearly, the set $N_0 \cup N_2 \cup \{u, v\}$ is dominated by D. Assume that there exists
a vertex $y \in N_1$ not dominated by D. Since $y \in N_1$, y has a neighbor in $N_0 \setminus D$.
Let $y x_1 \in E(G)$. Then, $\{y, x_1, x_0, u, w\}$ induces a $P_5$ in G for some $w \in N_2 \cap
N(u)$ (since $(\bigcup_{x \in N_2} N(x)) \cap N_1 = \{u, v\}$, $yw \notin E(G)$), a contradiction. Thus, D
dominates V(G).


Claim 2: D is a minimal dominating set in G.

On the contrary, suppose that there exists a proper subset D' of D that
dominates V(G). Since $x_0$ does not dominate $N_2$, u or v belongs to D'. Without loss
of generality, assume that $u \in D'$. If $v \notin D'$, then u dominates $N_2$, a
contradiction to $C_1 = \{u, v\}$ being a minimal subset of B that dominates $N_2$. This
implies that $u, v \in D'$. Therefore, $x_0 \notin D'$ because D' is a proper subset of D
and $u, v \in D'$. This implies that $D' = \{u, v\}$ dominates $N_0$. Also, we know that
$N(u) \cap N_0 \subseteq N(v) \cap N_0$. Thus, $N_0 \subseteq N(v)$. This implies that $\{x_0, x_1, x_2, v\}$
induces a $K_4$ in G, a contradiction. Thus, D is a minimal dominating clique.

Also, since $x_0 \in D$, $D \cap N_0 \ne \emptyset$.
Case 2.2: $|C_1| = 1$.

Let $C_1 = \{u\}$ and $D_1 = N(u) \cap N_0$. Since $u \in N_1$, $D_1 \ne \emptyset$. Also, since G is
$K_4$-free, u has a non-neighbor in $N_0$. Without loss of generality, assume that
$x_0 \in D_1$ and $x_1 \notin D_1$.


Claim 3: $C_2 = D_1 \cup \{u\}$ dominates V(G).
Clearly, $N_0 \cup N_2 \cup \{u\}$ is dominated by $C_2$. Suppose that $C_2$ does not dominate
V(G). Then, there exists a vertex $y \in N_1$ not adjacent to any vertex in $C_2$. Since
$y \in N_1$, y has a neighbor in $N_0 \setminus C_2$. Let $x_1 \in N(y)$. Then, $\{y, x_1, x_0, u, w\}$ forms
an induced $P_5$ for some $w \in N_2$, a contradiction. Hence, $C_2$ dominates V(G).

Thus, by Claim 3, there exists a subset C of $C_2$ that dominates V(G). Since
$x_1 \notin N(u)$, in order to dominate $x_1$, C contains an element of $N_0$.

From the above cases, it is evident that G has a minimal dominating clique
C such that $C \subseteq N_0 \cup N_1$ and $C \cap N_0 \ne \emptyset$.


Corollary 1. For a connected $P_5$-free chordal graph with clique number three,
a minimal dominating clique described in Lemma 2 can be found in O(n + m)
time.


The proof of Corollary 1 runs BFS with respect to some triangle in a connected
$P_5$-free chordal graph to find the sets $N_0$, $N_1$ and $N_2$, and the rest of the proof
follows from Lemma 2. A detailed proof is omitted in this paper.
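The BFS step mentioned for Corollary 1 can be sketched as a multi-source breadth-first search started from the clique. This is only an illustration under assumptions: the adjacency-dictionary representation and the function name `distance_layers` are not from the paper, and the sketch computes the layers only, not the dominating clique itself.

```python
from collections import deque


def distance_layers(adj, clique):
    """Return [N_0, N_1, ...] where N_i holds the vertices at distance i from `clique`."""
    dist = {v: 0 for v in clique}
    queue = deque(clique)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    layers = [set() for _ in range(max(dist.values()) + 1)]
    for v, d in dist.items():
        layers[d].add(v)
    return layers


# Illustrative input (an assumption): triangle {0, 1, 2} with a pendant path 0 - 3 - 4.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
layers = distance_layers(g, [0, 1, 2])
```

Each vertex and each edge is handled a constant number of times, which is consistent with the O(n + m) bound stated in Corollary 1 for this step.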

In the following observations and lemmas, we study the subclique number
and the cd-chromatic number of a connected $P_5$-free chordal graph with clique
number three based on the size of its minimal dominating clique. These
observations and lemmas help us to prove our theorem (Theorem 2) on $\{P_5, K_4\}$-free
chordal graphs.


Observation 4. 


1. If G is a graph with a universal vertex, then every subclique in G is also a
clique and every proper coloring of G is also a cd-coloring. Thus,
$\chi_{cd}(G) = \chi(G)$ [15] and $\omega_s(G) = \omega(G)$ [14].

2. Let G be a graph. Then, for any subclique S in G, $S \cap N[u]$ (respectively
$S \cap N(u)$) is a clique for every vertex $u \in V(G)$ since $S \cap N[u]$ (respectively
$S \cap N(u)$) is a subclique in the induced subgraph $G[N[u]]$ (a graph with a universal
vertex u).


Theorem 1. [12] If G is a connected graph with a dominating clique D, then for
any subclique S in G, the following statements are true.


(i) If $D \cap S \ne \emptyset$, then $|S| \le \omega(G)$.
(ii) If $D \cap S = \emptyset$, then $|S| \le |D|(\omega(G) - 1)$.

Fig. 1. An edge-dominated tree $E_{4,6}$ with a dominating edge.


We denote a tree T with a dominating edge uv as $E_{p,q}$ where p = |N(u)| and
q = |N(v)|. Clearly, $E_{1,q} = K_{1,q}$. Figure 1 shows the edge-dominated tree $E_{4,6}$.


Observation 5. A tree T is $P_5$-free if and only if $T \cong E_{p,q}$ for some $p, q \in \mathbb{N}$
or $T \cong K_1$ (see Observation 2).


Observation 6. Let G be a $P_5$-free chordal graph with a universal vertex u and
$\omega(G) = 3$. Then, $\chi_{cd}(G) = \chi(G) = 3 = \omega(G) = \omega_s(G)$ by Observation 4. Also,
since G \ u is a forest, every connected component of G \ u is isomorphic to $K_1$
or $E_{p,q}$ for some $p, q \in \mathbb{N}$ by Observation 5.


Observation 7. Let G be a $P_5$-free chordal graph with $\omega(G) = 3$. Then, G[N(u)]
is triangle-free for every vertex $u \in V(G)$. Hence, G[N(u)] is a $P_5$-free forest.


Lemma 3. Let G be a connected $P_5$-free chordal graph with $\omega(G) = 3$, and
let $uv \in E(G)$ such that D = {u, v} is a minimal dominating set in G. Then,
$\chi_{cd}(G) = \omega_s(G)$.


Proof. Clearly, $3 = \omega(G) \le \omega_s(G)$. By Theorem 1, $\omega_s(G) \le 4$. So,
$\omega_s(G) \in \{3, 4\}$.


Case 1: $\omega_s(G) = 3$.


Claim 1: N(u) \ N(v) or N(v) \ N(u) is an independent set.

If not, let $y_1, y_2$ and $z_1, z_2$ be vertices in N(u) \ N(v) and N(v) \ N(u) respectively
such that $y_1 y_2, z_1 z_2 \in E(G)$. Since $\omega_s(G) = 3$, $\{y_1, y_2, z_1, z_2\}$ is not a subclique,
and thus at least two of the four vertices are at distance two from each other.
Without loss of generality, assume that $d(y_1, z_1) = 2$. Hence, $y_1$ and $z_1$ have
a common neighbor w in G. Note that $w \notin \{u, v\}$ because $z_1 u, y_1 v \notin E(G)$.
Since $\{y_1, w, z_1, u, v\}$ does not induce a $C_5$, nor do its subsets induce a $C_4$, in G,
$uw, vw \in E(G)$. This implies that $w \notin \{y_1, y_2, z_1, z_2\}$. Also, w is adjacent
neither to $y_2$ nor to $z_2$ because G[N(u)] and G[N(v)] are triangle-free by
Observation 7. Also, $y_i z_j \notin E(G)$ for all $i, j \in \{1, 2\}$ (otherwise $\{y_i, u, v, z_j\}$
induces a $C_4$ in G). Thus, $\{y_2, y_1, w, z_1, z_2\}$ induces a $P_5$ in G, a contradiction.
Therefore, N(u) \ N(v) or N(v) \ N(u) is an independent set.


Let N(v) \ N(u) be an independent set. By Observation 7, G[N(u)] is bipartite.
Let $X \cup Y$ be a partition of N(u) into independent sets. Then,
$V(G) = X \cup Y \cup (N(v) \setminus N(u))$ is a 3-cd-coloring of G where color classes X and Y are
dominated by u and N(v) \ N(u) is dominated by v. Thus, $3 = \omega_s(G) \le \chi_{cd}(G) \le 3$.
Hence, $\chi_{cd}(G) = \omega_s(G) = 3$.


Case 2: $\omega_s(G) = 4$.
Clearly, $\chi_{cd}(G) \ge 4$. Then, N(u) \ N(v) and N(v) \ N(u) are not independent
sets (otherwise G admits a 3-cd-coloring as in Case 1).

Let $X_1 \cup Y_1 = N(u)$ and $X_2 \cup Y_2 = N(v) \setminus N(u)$ be bipartitions of the graphs
G[N(u)] and G[N(v) \ N(u)], respectively. Then, $V(G) = X_1 \cup X_2 \cup Y_1 \cup Y_2$ is
a 4-cd-coloring of G where the vertices in $X_1$ and $Y_1$ are dominated by u and the
vertices in $X_2$ and $Y_2$ are dominated by v. Thus, $4 = \omega_s(G) \le \chi_{cd}(G) \le 4$. This
shows that $\chi_{cd}(G) = \omega_s(G) = 4$.


Observation 8. Let G be a connected $P_5$-free chordal graph with a minimal
dominating clique {u, v} and $\omega(G) = 3$. Then, finding the subclique number of G is
the same as checking whether N(u) \ N(v) and N(v) \ N(u) are independent sets
by Lemma 3 (Case 1 of Lemma 3 shows that if $\omega_s(G) = 3$ then at least one of
N(u) \ N(v) and N(v) \ N(u) is an independent set, and Case 2 of Lemma 3 shows
that if $\omega_s(G) = 4$ then neither N(u) \ N(v) nor N(v) \ N(u) is an independent
set). The latter problem can be solved in O(n?) time. Hence, the subclique number 
and the cd-chromatic number of G can be found in O(n?) time. 
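The test described in Observation 8 amounts to two independence checks. Below is a minimal sketch under stated assumptions: the input is an adjacency dictionary of a graph that already satisfies the hypotheses of Lemma 3 (connected, $P_5$-free, chordal, clique number three, minimal dominating clique {u, v}); the code does not verify those hypotheses, and the function names are hypothetical.

```python
from itertools import combinations


def is_independent(adj, vertices):
    """True if no two of the given vertices are adjacent."""
    return not any(v in adj[u] for u, v in combinations(vertices, 2))


def subclique_number_edge_case(adj, u, v):
    """Subclique number (3 or 4) when {u, v} is a minimal dominating clique."""
    only_u = adj[u] - adj[v]  # N(u) \ N(v)
    only_v = adj[v] - adj[u]  # N(v) \ N(u)
    if is_independent(adj, only_u) or is_independent(adj, only_v):
        return 3
    return 4


# Illustrative inputs (assumptions): in g3 the triangle is {0, 1, 2};
# in g4 each endpoint of the edge 0-1 has its own private triangle.
g3 = {0: {1, 2, 3}, 1: {0, 2, 4}, 2: {0, 1}, 3: {0}, 4: {1}}
g4 = {0: {1, 3, 4}, 1: {0, 5, 6}, 3: {0, 4}, 4: {0, 3}, 5: {1, 6}, 6: {1, 5}}
```

By Lemma 3 the returned value is also the cd-chromatic number of such a graph. In `g4` the set {3, 4, 5, 6} is a subclique of size four, which matches the returned value.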


Observation 9. Let $N_0 = \{x_0, x_1, x_2\}$ be a minimal dominating clique in a
connected $P_5$-free chordal graph G with $\omega(G) = 3$. Also, let $N_1 = V(G) \setminus N_0$.
Then, for sets $M_0$ and $L_2$ defined as in Observation 3.1, $[M_0, L_2] = \emptyset$. In general,
$[M_i, L_j] = \emptyset$ for $i, j \in \{0, 1, 2\}$ and $i \ne j$.


Observation 10. Let $N_0 = \{x_0, x_1, x_2\}$ be a minimal dominating clique in a
connected $P_5$-free chordal graph G with $\omega(G) = 3$. Let $N_1 = V(G) \setminus N_0$. Then,
for $u \in M_i$ and $v \in M_j$, $d(u, v) \ne 2$ where the $M_i$'s are as in Observation 3.1 and
$i \ne j$.


Proofs of Observations 9 and 10 are omitted.


For a graph G with clique number three and a dominating clique of size three,
$\omega_s(G) \in \{3, 4, 5, 6\}$ by Theorem 1. We study the structure of G when $\omega_s(G) = 3$
in Observation 11, and in Observation 12 we discuss the structure of G when
$\omega_s(G) = 4, 5, 6$. These observations help us to prove Lemma 4.


Observation 11. Let G be a connected $P_5$-free chordal graph with $\omega(G) =
\omega_s(G) = 3$. Let $N_0 = \{x_0, x_1, x_2\}$ be a minimal dominating clique in G. Then,
$\chi_{cd}(G) = 3$ and $V(G) \setminus \{x_0, x_1, x_2\}$ is an independent set.


Proof. Let $N_1 = V(G) \setminus N_0$, and let $M_i$ and $L_i$ be the sets defined in Observation 3.1.
Claim 1: $V(G) \setminus \{x_0, x_1, x_2\}$ is an independent set.

Suppose that there is an edge $u_0 v_0$ in $V(G) \setminus \{x_0, x_1, x_2\}$. Then, by
Observations 3.1 and 9, $u_0, v_0 \in M_i$ for some i = 0, 1, 2. Without loss of
generality, assume that $u_0, v_0 \in M_0$. Since $\{x_0, x_1, x_2\}$ is a minimal dominating
set in G, there exist vertices $u_1 \in N(x_1) \cap A(x_0) \cap A(x_2) = M_1$ and
$u_2 \in N(x_2) \cap A(x_0) \cap A(x_1) = M_2$. Then, $\{u_0, u_1, u_2, v_0\}$ forms a subclique
of size four (because $d(u_0, v_0) = 1$, $d(u_j, u_k) \ne 2$ and $d(v_0, u_j) \ne 2$ for
$j, k \in \{0, 1, 2\}$ by Observation 10). This contradicts the fact that $\omega_s(G) = 3$.
Hence, $V(G) \setminus \{x_0, x_1, x_2\}$ is an independent set.

Let $X_i = \{x_{i+1}\} \cup (N(x_i) \cap A(x_{i+1})) = \{x_{i+1}\} \cup M_i \cup L_{i+1}$ for i = 0, 1, 2;
from here onwards in this section, we take all the subscripts of $x_i$, $M_i$ and $L_i$ to
be mod 3. Each $X_i$ is an independent set since $M_i \cup L_{i+1} \subseteq V(G) \setminus \{x_0, x_1, x_2\}$
is an independent set and $x_{i+1}$ is not adjacent to any vertex in $M_i \cup L_{i+1}$ by the
definitions of $M_i$ and $L_{i+1}$.
Claim 2: $V(G) = X_0 \cup X_1 \cup X_2$ is a 3-cd-coloring of G.

Let $x \in V(G) \setminus \{x_0, x_1, x_2\}$. Then, $x \in N(x_i)$ for some i = 0, 1, 2. If $x \in A(x_{i+1})$,
then $x \in N(x_i) \cap A(x_{i+1}) \subseteq X_i$. Else, $x x_{i+1} \in E(G)$. Since $\{x_i, x_{i+1}, x_{i+2}, x\}$
does not induce a $K_4$, $x x_{i+2} \notin E(G)$. This implies that $x \in N(x_{i+1}) \cap A(x_{i+2}) \subseteq
X_{i+1}$. Thus, every vertex in G belongs to $X_i$ for some i = 0, 1, 2. Also, by the
definitions of $M_i$ and $L_{i+1}$, each $X_i$ is dominated by $x_i$. Hence, $V(G) = X_0 \cup
X_1 \cup X_2$ forms a 3-cd-coloring of G. This implies that $3 = \omega_s(G) \le \chi_{cd}(G) \le 3$.
Thus, $\chi_{cd}(G) = \omega_s(G) = 3$.


Observation 12. Let G be a connected $P_5$-free chordal graph with $\omega(G) = 3$ and
$\omega_s(G) = j$ for some j = 4, 5, 6. Let $N_0 = \{x_0, x_1, x_2\}$ be a minimal dominating
clique in G. Then, there exist j − 3 integers $i_1, \dots, i_{j-3} \in \{0, 1, 2\}$ such that
$V(G) \setminus (N(x_{i_1}) \cup \dots \cup N(x_{i_{j-3}}))$ is an independent set, and every connected
component of $G[N(x_{i_1})], \dots, G[N(x_{i_{j-3}})]$ is isomorphic to $K_1$ or $E_{p,q}$ for some
$p, q \in \mathbb{N}$. Also, $\chi_{cd}(G) = j$.


Proof of Observation 12 is omitted in this paper.


Lemma 4. Let G be a connected $P_5$-free chordal graph with $\omega(G) = 3$, and let
$\{x_0, x_1, x_2\}$ be a minimal dominating clique in G. Then, $\chi_{cd}(G) = \omega_s(G)$.


Proof. Since $\omega(G) \le \omega_s(G)$, it is clear that $\omega_s(G) \in \{3, 4, 5, 6\}$ by Theorem 1.
The rest of the proof follows from Observations 11 and 12.


Observation 13. The subclique number and the cd-chromatic number of a
connected $P_5$-free chordal graph with clique number three and a minimal dominating
clique $\{x_0, x_1, x_2\}$ can be found in O(n?) time.


Observations 11 and 12 will prove Observation 13. A detailed proof of Observa- 
tion 13 is omitted in this paper. 


Theorem 2. Let G be a $\{P_5, K_4\}$-free chordal graph. Then, $\chi_{cd}(G) = \omega_s(G)$.
Also, the cd-chromatic number and the subclique number of G can be found in 
O(n?) time. 


Proof. Let G be a connected $\{P_5, K_4\}$-free chordal graph. Then, $\omega(G) \le 3$. We
prove the theorem in the following three cases.

Case 1: $\omega(G) = 1$.
Then, $G \cong K_1$ and $\chi_{cd}(G) = \omega_s(G) = 1$. This can be found in constant time.


Case 2: $\omega(G) = 2$.
Then, by Observation 2, G is a $P_5$-free tree and $\chi_{cd}(G) = \omega_s(G) = 2$. This can
be found in constant time.


Case 3: w(G) = 3. 
By Lemma 2, G admits a minimal dominating clique, say C’, and it can be 
obtained in O(n?) time by Corollary 1. Then, the following statements are true. 


1. If |C| = 1, then $\chi_{cd}(G) = \omega_s(G) = 3$ by Observation 6. This result can be
obtained in constant time.

2. If |C| = 2, then $\chi_{cd}(G) = \omega_s(G)$ by Lemma 3. Also, $\chi_{cd}(G)$ and $\omega_s(G)$ can
be found in O(n?) time by Observation 8.

3. If |C| = 3, then $\chi_{cd}(G) = \omega_s(G)$ by Lemma 4. Also, $\chi_{cd}(G)$ and $\omega_s(G)$ can
be found in O(n?) time by Observation 13.


Thus, from the above cases, we can conclude the following. 


1. For a connected $\{P_5, K_4\}$-free chordal graph G, $\chi_{cd}(G) = \omega_s(G)$.
2. The cd-chromatic number and the subclique number of a connected $\{P_5, K_4\}$-free
chordal graph G can be found in O(n?) time.


We know that for a disconnected graph H with connected components
$G_1, \dots, G_k$, $\chi_{cd}(H) = \sum_{i=1}^{k} \chi_{cd}(G_i)$ and $\omega_s(H) = \sum_{i=1}^{k} \omega_s(G_i)$. Thus, the above
results hold for the class of $\{P_5, K_4\}$-free chordal graphs.


4 $P_6$-free Chordal Bipartite Graphs


A bipartite graph G is said to be chordal bipartite if every cycle of length
at least six has a chord in it. Clearly, a chordal bipartite graph G is $C_n$-free for
every $n \ne 4$. It is known that in the class of $P_6$-free trees, $\omega_s - \omega$ can be arbitrarily
large [12]. In this section, we prove that $\chi_{cd}(G) = \omega_s(G)$ when G is a $P_6$-free
chordal bipartite graph. In addition, we show that a maximum subclique of a
$P_6$-free chordal bipartite graph can be found in $O(n^3)$ time.


Theorem 3. [10] A graph G is $\{P_6, C_6, K_3\}$-free if and only if every connected
induced subgraph of G has a dominating set that induces a complete bipartite
subgraph.


Theorem 4. Let G be a connected $P_6$-free chordal bipartite graph. Then,
$\chi_{cd}(G) = \omega_s(G)$.


Proof. Clearly, G is a $\{P_6, C_6, K_3\}$-free graph. Then, by Theorem 3, G has a
dominating set which induces a complete bipartite subgraph. Let
$D = \{x_1, \dots, x_k, y_1, \dots, y_l\}$ be such a set of minimum cardinality. Let $D = X' \cup Y'$ be a
partition of D into independent sets $X' = \{x_1, \dots, x_k\}$ and $Y' = \{y_1, \dots, y_l\}$. Let
$X = \bigcup_{j=1}^{l} N(y_j)$ and $Y = \bigcup_{i=1}^{k} N(x_i)$. Clearly, $X' \subseteq X$ and $Y' \subseteq Y$. Since
G is triangle-free, $X \cap Y = \emptyset$. Thus, $X \cup Y$ is a partition of V(G). We show
that X is an independent set. If not, let $w, z \in X$ such that $wz \in E(G)$. Since
$X = \bigcup_{j=1}^{l} N(y_j)$, there exist $p, q \in \{1, \dots, l\}$ such that $y_p w, y_q z \in E(G)$. Then,
either $\{y_p, w, z, y_q, x_1\}$ induces a $C_5$ or one of its subsets induces a $K_3$ in G, a
contradiction. Thus, X is an independent set. Similarly, we can show that Y is
an independent set. Since D is a minimum dominating biclique of G, for every
vertex $x_i$ ($1 \le i \le k$), there exists a vertex $u_i \in Y$ such that $x_i u_i \in E(G)$
and $x_\alpha u_i \notin E(G)$ for every $\alpha \in \{1, \dots, k\} \setminus \{i\}$. Similarly, for every vertex $y_j$
($1 \le j \le l$), there exists a vertex $v_j \in X$ such that $y_j v_j \in E(G)$ and $y_\beta v_j \notin E(G)$
for every $\beta \in \{1, \dots, l\} \setminus \{j\}$. Note that the $u_i$'s and $v_j$'s are pairwise distinct. Let
$S = \{u_1, \dots, u_k, v_1, \dots, v_l\}$. Clearly, |S| = k + l. The following claims prove that
S is a subclique.


Claim 1: $d(u_1, u_2) \ne 2$.

By the definition of the $u_i$'s, $x_1 u_1, x_2 u_2 \in E(G)$ and $x_2 u_1, x_1 u_2 \notin E(G)$. Contrary
to Claim 1, assume that $d(u_1, u_2) = 2$. Then, there exists a vertex a in V(G)
such that $a u_1$ and $a u_2$ are edges in G. Since $u_1 \in Y$ and Y is an independent
set, $a \notin Y$. We claim that $a \notin X'$. If not, let $a = x_i$ for some i = 1, ..., k. Then,
$x_i u_1, x_i u_2 \in E(G)$. We know that $x_1 u_1 \in E(G)$ and $x_\alpha u_1 \notin E(G)$
for $\alpha \in \{2, \dots, k\}$. Hence, $a = x_1$. This implies that $a u_2 = x_1 u_2 \in E(G)$,
a contradiction to $x_1 u_2 \notin E(G)$. Thus, $a \notin X'$. Hence, $a \in X \setminus X'$.

Next, we show that the sets $N(x_1)$ and $N(x_2)$ are dominated by a. On
the contrary, suppose that there exists a vertex $u \in N(x_1) \setminus \{u_1\}$ such
that $au \notin E(G)$. Then, $\{u, x_1, u_1, a, u_2, x_2\}$ induces a $P_6$ or a $C_6$ in G (because
$u u_1, u a, u u_2, x_1 a, x_1 u_2, x_1 x_2, u_1 u_2, u_1 x_2, a x_2$ are not edges in G), a
contradiction. This implies
that every vertex in $N(x_1)$ is adjacent to a. Similarly, we can show that every
vertex in $N(x_2)$ is adjacent to a. Thus, a dominates the set $N(x_1) \cup N(x_2) \supseteq Y'$.
This implies that $(D \setminus \{x_1, x_2\}) \cup \{a\}$ is a biclique of size k + l − 1 dominating
V(G), a contradiction to the fact that D is a minimum dominating biclique.
Therefore, $d(u_1, u_2) \ne 2$.


Similarly, we can prove that for $1 \le i_1 < i_2 \le k$, $d(u_{i_1}, u_{i_2}) \ne 2$, and for
$1 \le j_1 < j_2 \le l$, $d(v_{j_1}, v_{j_2}) \ne 2$.


Claim 2: $d(u_1, v_1) \ne 2$.

Clearly, $u_1 \in Y$ and $v_1 \in X$. Contrary to Claim 2, assume that $d(u_1, v_1) = 2$.
Thus, $u_1 v_1 \notin E(G)$. In addition, there exists a vertex a in V(G) such that
$a u_1, a v_1$ are edges in G. Then, $a \notin X \cup Y = V(G)$ because X and Y are
independent sets, a contradiction. This proves Claim 2.


Similarly, we can prove that $d(u_i, v_j) \ne 2$ for $1 \le i \le k$ and $1 \le j \le l$.


By Claims 1 and 2, it is evident that $S = \{u_1, \dots, u_k, v_1, \dots, v_l\}$ is a subclique
of size k + l in G. Therefore, $\omega_s(G) \ge k + l$.

Next, we produce a (k + l)-cd-coloring of G. Let $U_1 = N(x_1)$, and
$U_i = N(x_i) \setminus \bigcup_{\alpha=1}^{i-1} N(x_\alpha)$ for $2 \le i \le k$. Clearly, $u_i \in U_i$ and each $U_i$ is dominated by
the vertex $x_i$ for $1 \le i \le k$. Similarly, define $V_1 = N(y_1)$, and
$V_j = N(y_j) \setminus \bigcup_{\beta=1}^{j-1} N(y_\beta)$ for $2 \le j \le l$. Clearly, $v_j \in V_j$ and each $V_j$ is dominated by
the vertex $y_j$ for $1 \le j \le l$. Also, since D dominates V(G), each vertex in
Y belongs to $U_i$ for some i and each vertex in X belongs to $V_j$ for some j.
Thus, $V(G) = (\bigcup_{i=1}^{k} U_i) \cup (\bigcup_{j=1}^{l} V_j)$ is a (k + l)-cd-coloring of G. This implies
that $\chi_{cd}(G) \le k + l$. Thus, we have $k + l \le \omega_s(G) \le \chi_{cd}(G) \le k + l$. Hence,
$\chi_{cd}(G) = \omega_s(G) = k + l$.
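The construction of the color classes $U_i$ and $V_j$ in the proof can be sketched as follows. This is an illustration under assumptions: the graph is an adjacency dictionary, the two sides of a minimum dominating biclique are passed in as lists `xs` and `ys` (finding them is not shown), and the function name is hypothetical.

```python
def biclique_cd_coloring(adj, xs, ys):
    """Return a list of (dominator, color class) pairs built from a dominating biclique.

    For each side, the class of the i-th vertex is its neighborhood minus the
    neighborhoods of the earlier vertices on the same side.
    """
    classes = []
    for side in (xs, ys):
        covered = set()
        for d in side:
            classes.append((d, adj[d] - covered))
            covered |= adj[d]
    return classes


# Illustrative input (an assumption): the path 0 - 1 - 2 - 3 - 4,
# with dominating biclique X' = {1, 3}, Y' = {2}.
p5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
coloring = biclique_cd_coloring(p5, [1, 3], [2])
```

For this path the sketch yields three classes, {0, 2}, {4} and {1, 3}, dominated by 1, 3 and 2 respectively, which agrees with k + l = 3.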


Remark 1. 


1. A cycle on five vertices ($C_5$) is a $\{P_6, C_6, K_3\}$-free graph with $\omega_s(C_5) = 2$
and $\chi_{cd}(C_5) = 3$. Since $\chi_{cd}(C_5) \ne \omega_s(C_5)$, the above theorem (Theorem 4)
on $P_6$-free chordal bipartite graphs cannot be extended to $\{P_6, C_6, K_3\}$-free
graphs.

2. Theorem 4 on $P_6$-free chordal bipartite graphs cannot be extended to $P_7$-free
chordal bipartite graphs. A $P_7$-free chordal bipartite graph G with $\chi_{cd}(G) = 4$
and $\omega_s(G) = 2$ is shown in Fig. 2.
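The values claimed for $C_5$ in Remark 1 can be confirmed by exhaustive search. The sketch below is a brute-force check intended only for very small graphs; it assumes an adjacency-dictionary representation and that a class is dominated by w when it lies inside N[w]. It is not an algorithm from the paper, and the function names are hypothetical.

```python
from itertools import combinations, product


def at_distance_two(adj, u, v):
    """True if u and v are non-adjacent and share a neighbor."""
    return u != v and v not in adj[u] and bool(adj[u] & adj[v])


def subclique_number(adj):
    """Size of a largest vertex set with no pair at distance two (exponential time)."""
    vertices = list(adj)
    for r in range(len(vertices), 0, -1):
        for S in combinations(vertices, r):
            if not any(at_distance_two(adj, u, v) for u, v in combinations(S, 2)):
                return r
    return 0


def valid_class(adj, cls):
    """Non-empty, independent, and contained in some closed neighborhood."""
    if not cls or any(v in adj[u] for u, v in combinations(cls, 2)):
        return False
    return any(cls <= adj[w] | {w} for w in adj)


def cd_chromatic_number(adj):
    """Smallest k admitting a k-cd-coloring, by trying every assignment (exponential time)."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            classes = [{v for v, c in zip(vertices, colors) if c == i} for i in range(k)]
            if all(valid_class(adj, cls) for cls in classes):
                return k


c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

For `c5` the search returns subclique number 2 and cd-chromatic number 3, the values stated in the remark.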


Lemma 5. [4] The total dominating set problem in chordal bipartite graphs can 
be solved in O(n?) time. 


Theorem 5. [11] The cd-chromatic number of a triangle-free graph is equal to
its total domination number. 


Fig. 2. A $P_7$-free chordal bipartite graph G where {k, l, a, d, g, h} induces a $P_6$. Here,
$\chi_{cd}(G) = \gamma_t(G) = 4$ (see Theorem 5). Also, $\omega_s(G) = 2$: choose any two vertices
$x, y \in V(G)$ such that $d(x, y) \ne 2$, then for any other vertex $z \in V(G)$, either d(x, z) = 2
or d(y, z) = 2.


Remark 2. If D is a total dominating set of size $\gamma_t(G)$ in a triangle-free graph
G, then Merouane et al.’s algorithm [11] produces a $\gamma_t(G)$-cd-coloring in O(n + m)
time. Thus, by Lemma 5 and Theorem 5, an optimal cd-coloring of a chordal
bipartite graph can be found in O(n?) time using Merouane et al.’s algorithm. 


Lemma 6. Let G be a connected $P_6$-free chordal bipartite graph. Then, every
total dominating set in G of size $\gamma_t(G)$ is a biclique.

Proof of Lemma 6 is omitted in this paper.


Corollary 2. A maximum subclique of a $P_6$-free chordal bipartite graph can be
found in $O(n^3)$ time.


Proof. Let G be a connected $P_6$-free chordal bipartite graph. Then, by Lemmas 5
and 6, a biclique $D = \{x_1, \dots, x_k, y_1, \dots, y_l\}$ of size $\gamma_t(G)$ dominating V(G) can
be found in O(n?) time (here, $\gamma_t(G) = k + l$). Since D is a minimum dominating
biclique of G, for every vertex $x_i$ ($1 \le i \le k$), there exists a vertex $u_i \in Y$ such
that $x_i u_i \in E(G)$ and $x_\alpha u_i \notin E(G)$ for every $\alpha \in \{1, \dots, k\} \setminus \{i\}$. Similarly, for
every vertex $y_j$ ($1 \le j \le l$), there exists a vertex $v_j \in X$ such that $y_j v_j \in E(G)$
and $y_\beta v_j \notin E(G)$ for every $\beta \in \{1, \dots, l\} \setminus \{j\}$. It is proved in Theorem 4 that the
set $S = \{u_1, \dots, u_k, v_1, \dots, v_l\}$ is a maximum subclique in G. The set S can be
found by choosing exactly one vertex from each set
$X_i = N(x_i) \setminus \bigcup_{1 \le \alpha \le k,\ \alpha \ne i} N(x_\alpha)$ for $i \in \{1, \dots, k\}$ and
$Y_j = N(y_j) \setminus \bigcup_{1 \le \beta \le l,\ \beta \ne j} N(y_\beta)$ for $j \in \{1, \dots, l\}$. A set $X_i$
(respectively $Y_j$) can be found in $O(n^2)$ time and at most n such sets are found.
Thus, a maximum subclique of G can be found in $O(n^3)$ time.

From the above result, we can conclude that a maximum subclique in a
$P_6$-free chordal bipartite graph can be found in $O(n^3)$ time because
$\omega_s(H) = \sum_{i=1}^{p} \omega_s(G_i)$ for a graph H with connected components $G_1, \dots, G_p$.
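The selection step in the proof of Corollary 2 picks one private neighbor per biclique vertex. A minimal sketch follows, assuming an adjacency dictionary and assuming the two sides `xs`, `ys` of a minimum dominating biclique are already known (computing them via Lemma 5 is not shown). The function name and the tie-breaking by `min` are choices made for this illustration only.

```python
def private_neighbor_subclique(adj, xs, ys):
    """Pick one private neighbor for every vertex of the biclique (xs, ys)."""
    chosen = []
    for side in (xs, ys):
        for d in side:
            # Neighbors of the other biclique vertices on the same side.
            others = set().union(*(adj[o] for o in side if o != d))
            private = adj[d] - others
            if not private:
                raise ValueError(f"vertex {d} has no private neighbor")
            chosen.append(min(private))  # any element of `private` would do
    return chosen


# Illustrative input (an assumption): the path 0 - 1 - 2 - 3 - 4,
# with dominating biclique X' = {1, 3}, Y' = {2}.
p5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
S = private_neighbor_subclique(p5, [1, 3], [2])
```

On this path the sketch selects the vertices 0, 4 and 1. No two of them are at distance two, so they form a subclique of size k + l = 3.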


5 Conclusion 


In this paper, we proved that the cd-chromatic number and the subclique number
are equal for the class of $\{P_5, K_4\}$-free chordal graphs and the class of $P_6$-free
chordal bipartite graphs. In addition, we proved that the cd-chromatic number
and the subclique number of a $\{P_5, K_4\}$-free chordal graph can be found in
O(n?) time. Also, a maximum subclique in a $P_6$-free chordal bipartite graph can
be found in $O(n^3)$ time.

Abstract. First, we present a new algorithm for the single-source shortest paths
problem (SSSP) in edge-weighted directed graphs, with n vertices, m edges, and
both positive and negative real edge weights. Given a positive integer parameter
t, in O(tm) time the algorithm finds for each vertex v a path distance from the
source to v not exceeding that yielded by the shortest path from the source to
v among the so-called t+light paths. A directed path between two vertices is
t+light if it contains at most t more edges than the minimum edge-cardinality
directed path between these vertices. For t = O(n), our algorithm yields an
O(nm)-time solution to SSSP in directed graphs with real edge weights matching
that of Bellman and Ford.

Our main contribution is a new, output-sensitive algorithm for the all-pairs 
shortest paths problem (APSP) in directed acyclic graphs (DAGs) with positive 
and negative real edge weights. The running time of the algorithm depends on 
such parameters as the number of leaves in (lexicographically first) shortest-paths 
trees, and the in-degrees in the input graph. If the trees are sufficiently thin on the 
average, the algorithm is substantially faster than the best known algorithm. 

Finally, we discuss an extension of hypothetical improved upper time-bounds 
for APSP in non-negatively edge-weighted DAGs to include directed graphs with 
a polynomial number of large directed cycles. 


1 Introduction 


The length of a path in an edge-weighted graph is the sum of the weights of edges on
the path. A shortest path between two vertices in a graph has minimal length among
all paths between these vertices. The distance between vertices v and u is the length of
a shortest path from v to u. If the graph is directed, the paths are also required to be
directed.


Shortest path problems, in particular the single-source shortest paths problem 
(SSSP) and the all-pairs shortest paths problem (APSP), belong to the most basic and 
important problems in graph algorithms [5,17]. There are several variants of SSSP and 
APSP depending among other things on the restrictions on edge weights and the input 
graphs. The input to these problems is a directed or an undirected edge-weighted graph.
The output is a representation of shortest paths between the source and all other vertices
or between all pairs of vertices in the graph, respectively.

In the general case of directed graphs (without negative cycles), when both positive
and negative real edge weights are allowed, the difference between the best known
asymptotic upper time-bounds for SSSP and APSP, respectively, is surprisingly small.
Namely, if the input directed graph has n vertices and m edges with real weights, then
the best known SSSP algorithm, due to Bellman [3], Ford [7], and Moore [12], runs in
O(nm) time, while APSP can be solved in $O(nm + n^2 \log n)$ time [11,17]. The
APSP solution uses Johnson’s O(nm)-time reduction of the general edge weight case
to the non-negative edge weight case and then runs Dijkstra’s algorithm [6] n times [11,17].
The latter upper time-bound for APSP with arbitrary real edge weights has been more
recently improved to $O(nm + n^2 \log\log n)$ by Pettie [14]. Note that the aforementioned
best asymptotic upper time-bounds for SSSP and APSP are different only for
sparse graphs with $o(n \log\log n)$ edges. Interestingly, when edge weights are integers,
the best known upper time-bound for APSP just in terms of n is $n^3 / 2^{\Omega(\sqrt{\log n})}$ [4].

The situation alters dramatically when the input directed graph is acyclic, i.e., when
it does not contain directed cycles. Then, a simple dynamic programming algorithm
processing vertices in a topologically sorted order solves the SSSP problem in O(n + m)
time [5]; an O(n(n + m))-time solution to the APSP problem in this case follows.
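The dynamic programming method for DAGs mentioned above can be sketched as follows. This is a standard textbook formulation, not code from the paper; it assumes vertices are numbered 0, ..., n − 1 and edges are given as (tail, head, weight) triples, and the function name `dag_sssp` is chosen for this illustration.

```python
import math
from collections import deque


def dag_sssp(n, edges, source):
    """Single-source shortest paths in a DAG; negative weights are allowed."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1

    # Topological order by repeatedly removing vertices of in-degree zero.
    order = []
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("input graph contains a directed cycle")

    # Relax the outgoing edges of each vertex once, in topological order.
    dist = [math.inf] * n
    dist[source] = 0
    for u in order:
        if dist[u] == math.inf:
            continue  # u is not reachable from the source
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

Running this sketch once from every vertex gives the O(n(n + m)) bound for APSP in DAGs noted in the text.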

In fact, Yen used the aforementioned method for SSSP in DAGs iteratively in
order to improve the time complexity of the Bellman-Ford algorithm for directed graphs by
a constant factor [15]. The Bellman-Ford algorithm runs in n − 1 iterations. In each iteration,
for each edge e, the current distance (from the source) at the head of e is compared to
the sum of the current distance at the tail of e and the weight of e. If the sum is smaller,
the distance at the head of e is updated. To achieve the improvement, Yen imposes a
linear order on the vertices of the input directed graph, which yields a decomposition of
the graph into two DAGs. Next, the SSSP method for DAGs is run on each of the two
DAGs instead of an iteration of the Bellman-Ford algorithm [15]. Bannister and Eppstein
obtained a further improvement of the time complexity of the Bellman-Ford algorithm by
a constant factor using a random linear order [2].
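The basic Bellman-Ford iteration described in this paragraph can be sketched as below. This is the plain textbook version, without Yen's or Bannister and Eppstein's improvements; the (tail, head, weight) edge format, the early exit, and the final negative-cycle check are choices made for this illustration.

```python
import math


def bellman_ford(n, edges, source):
    """Shortest-path distances from `source`; raises if a negative cycle is reachable."""
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(n - 1):
        changed = False
        for tail, head, weight in edges:
            # Compare the distance at the head with distance at the tail plus the weight.
            if dist[tail] + weight < dist[head]:
                dist[head] = dist[tail] + weight
                changed = True
        if not changed:
            break  # no update in a full pass, so the distances are final
    for tail, head, weight in edges:
        if dist[tail] + weight < dist[head]:
            raise ValueError("negative cycle reachable from the source")
    return dist
```

Each pass scans all m edges, and at most n − 1 passes are made, which gives the O(nm) bound quoted in the text.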

A pair of vertices in an edge weighted undirected or directed graph can be connected 
by several paths, in particular several shortest paths. Beside the length of a path, the 
number of edges forming it can be an important characteristic. For example, Zwick 
provided several exact and approximation algorithms for all pairs lightest (i.e., having 
minimal number of edges) shortest paths in directed graphs with restricted edge weights 
in [18]. 

In this paper, first we consider t+light paths, i.e., directed paths that have at most
t more edges than the paths with the same endpoints having the minimal number of
edges. In part following [15], we iterate O(t) times the SSSP method for DAGs on two
implicit DAGs yielded by an extension of the BFS partial order to a linear order. The
iterations alternately process the vertices in a breadth-first sorted order and the reverse
order. As a result, we obtain path distances from the source to all other vertices that are
not greater than the corresponding shortest-path distances for t+light paths. This takes
O(tm) time in total. For t = n - 2, our method matches that of Bellman-Ford for SSSP
in directed graphs with real edge weights.
A vertex v is an ancestor (a direct ancestor, respectively) of a vertex u in a DAG if
there is a directed path (edge, respectively) from v to u in the DAG. 

Our main result is a new, output-sensitive algorithm for the APSP problem in DAGs.
It runs in time $O(\min\{n^{\omega}, nm + n^2 \log n\} + \sum_{v \in V} \mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)|)$,
where n is the number of vertices, m is the number of edges, $\omega$ is the exponent of
fast n × n matrix multiplication¹, indeg(v) stands for the indegree of v, $T_v$ is a tree of
lexicographically-first shortest directed paths from all ancestors of v to v, $\mathrm{leaf}(T_v)$
is the set of leaves in $T_v$, and for a set X, |X| stands for its size. Note that if $T_v$ is a
path the term $O(\mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)|)$ equals $O(\mathrm{indeg}(v))$ while when $T_v$ is a star
with v as a sink the term becomes $O(\mathrm{indeg}(v)\,|T_v|)$. Thus, the running time of the
APSP algorithm can be as low as $O(n^{\omega})$ and as high as $O(n^{\omega} + nm)$. It follows also
that if $\alpha$ is defined by $\max_{v \in V} |\mathrm{leaf}(T_v)| = O(n^{\alpha})$ then the algorithm runs in
$O(n^{\omega} + mn^{\alpha})$ time. Similarly, if $\beta$ is defined by
$\frac{1}{n}\sum_{v \in V} |\mathrm{leaf}(T_v)| = O(n^{\beta})$ then the algorithm runs in $O(n^{\omega} + n^{2+\beta})$
time.

Finally, we provide an extension of hypothetical, improved upper time-bounds for 
APSP in DAGs with non-negative edge weights to include directed graphs with a poly- 
nomial number of large directed cycles. 

In the full version [9], we additionally present experimental comparisons of our 
SSSP algorithm with the Bellman-Ford one. They show that our SSSP algorithm con- 
verges to the true shortest-path distances on dense edge-weighted pseudorandom graphs 
faster than the Bellman-Ford algorithm does. 


1.1. Paper Organization 


In the next section, we provide our solution to the SSSP problem in directed graphs 
with real edge weights based on the SSSP method for DAGs and the BFS partial order 
in terms of t+light paths. Section 3 is devoted to our output-sensitive algorithm for the 
APSP problem in DAGs with real edge weights and its analysis. In Sect. 4, we discuss 
the extension of hypothetical, improved bounds for APSP in DAGs with non-negatively 
weighted edges to directed graphs with a polynomial number of large directed cycles. 
We conclude with Final remarks. A description of our experimental results can be found 
in the full version [9]. 


2 An Application of DAG SSSP Method to Arbitrary Digraphs


The SSSP problem for directed acyclic graphs can be solved by topologically sort- 
ing the DAG vertices and applying straightforward dynamic programming. For con- 
secutive vertices v in the sorted order, the distance dist(v) of v from the source is 
set to the minimum of dist(u) + weight(u,v) over all direct ancestors u of v, where 
weight(u, v) stands for the weight of the edge (u,v). It takes linear (in the size of the 
DAG) time. Yen used the dynamic programming method iteratively to improve the time 
complexity of Bellman-Ford algorithm for directed graphs by a constant factor in [15]. 
Interestingly, we can similarly apply this method iteratively to determine shortest-path
distances among paths using almost the minimal number of edges. To formulate our
algorithm (Algorithm 1), we need the following definition and two procedures.

¹ ω is not greater than 2.3729 [1].


Definition 1. A directed path from a vertex u to a vertex v in a directed graph is lightest 
if it consists of the smallest possible number of edges. A path from u to v is t+light if it 
includes at most t more edges than a lightest path from u to v.
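To make Definition 1 concrete, the following sketch counts the edges of lightest paths by BFS and tests whether a given path is t+light. The helper names `lightest_edge_counts` and `is_t_light`, and the representation of a path as a list of vertices starting at the BFS source, are assumptions of this example.

```python
from collections import deque

def lightest_edge_counts(n, edges, source):
    """hops[v] = number of edges on a lightest path from source to v (None if unreachable)."""
    out = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
    hops = [None] * n
    hops[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in out[u]:
            if hops[v] is None:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

def is_t_light(path, hops, t):
    """True if the path has at most t more edges than a lightest path to its endpoint."""
    return len(path) - 1 <= hops[path[-1]] + t
```

For instance, in a graph with edges 0→1, 1→2, 0→2, 2→3, the path 0, 1, 2, 3 is 1+light but not 0+light.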


procedure SSSPDAG(G, D)

Input: A directed graph G = (V, E) with real edge weights, linearly ordered vertices
v_1, ..., v_n, and a 1-dimensional table D of size n with upper bounds on the distances
from v_1 to all vertices in V.

Output: Improved upper bounds on the shortest-path distances from v_1 to all vertices in
V in the table D.

for j = 2, ..., n do
    for each edge (v_i, v_j) where i < j
        D(v_j) ← min{D(v_j), D(v_i) + weight(v_i, v_j)}

procedure reverseSSSPDAG(G, D)

Input and output: the same as in SSSPDAG(G, D)

for j = n-1, ..., 1 do
    for each edge (v_i, v_j) where i > j
        D(v_j) ← min{D(v_j), D(v_i) + weight(v_i, v_j)}


Algorithm 1 

Input: A directed graph (V, E) with n vertices, real edge weights and a distinguished 
source vertex s, and a positive integer t. 

Output: Upper bounds on the shortest-path distances from s to all other vertices in V 
not exceeding the corresponding shortest-path distances constrained to t+light paths. 


1. Run BFS from the source s.

2. Order the vertices of G extending the BFS partial order according to the levels of the
   BFS tree, i.e., s comes first, then the vertices reachable by direct edges from s, then
   the vertices reachable by paths composed of two edges and so on. We may assume
   w.l.o.g. that all vertices are reachable from s or alternatively extend the
   aforementioned order with the non-reachable vertices arbitrarily.

3. Initialize a 1-dimensional table D of size n, setting D(v_1) ← 0 and D(v_j) ← ∞
   for 1 < j ≤ n.

4. SSSPDAG(G, D)

5. for k = 1, ..., t do
   (a) reverseSSSPDAG(G, D)
   (b) SSSPDAG(G, D)
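Algorithm 1 and its two procedures can be sketched in Python as follows. This is an illustrative sketch only: the names `bfs_linear_order` and `t_light_sssp`, the 0-based numbering, the `(tail, head, weight)` edge triples, and the decision to ignore self-loops are assumptions rather than details taken from the paper.

```python
from collections import deque
import math

def bfs_linear_order(n, edges, source):
    """Steps 1-2: a linear order that is non-decreasing in the BFS level."""
    out = [[] for _ in range(n)]
    for u, v, _ in edges:
        out[u].append(v)
    seen = [False] * n
    seen[source] = True
    order = [source]
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in out[u]:
            if not seen[v]:
                seen[v] = True
                order.append(v)
                queue.append(v)
    # Vertices not reachable from the source are appended arbitrarily.
    order.extend(v for v in range(n) if not seen[v])
    return order

def t_light_sssp(n, edges, source, t):
    """Upper bounds on distances, no worse than the best t+light path lengths."""
    order = bfs_linear_order(n, edges, source)
    pos = [0] * n
    for p, v in enumerate(order):
        pos[v] = p
    forward_in = [[] for _ in range(n)]    # incoming edges (u, v) with pos[u] < pos[v]
    backward_in = [[] for _ in range(n)]   # incoming edges (u, v) with pos[u] > pos[v]
    for u, v, w in edges:
        if pos[u] < pos[v]:
            forward_in[v].append((u, w))
        elif pos[u] > pos[v]:
            backward_in[v].append((u, w))
    dist = [math.inf] * n                  # Step 3
    dist[source] = 0

    def sweep(vertices, incoming):
        for v in vertices:
            for u, w in incoming[v]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w

    sweep(order[1:], forward_in)                   # Step 4: SSSPDAG
    for _ in range(t):                             # Step 5
        sweep(reversed(order[:-1]), backward_in)   # (a) reverseSSSPDAG
        sweep(order[1:], forward_in)               # (b) SSSPDAG
    return dist
```

In the small example used for testing, the edge 2→1 is a backward edge, so the true distance to vertex 1 is found only once t ≥ 1.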


Theorem 1. Let G be a directed graph with n vertices, m real-weighted edges, and a 
distinguished source vertex s. For all vertices v of G different from s, an upper bound 
on their distance from the source vertex s, not exceeding the length of a shortest path 
among t+light paths from s to v, can be computed in O((t + 1)(m + n)) total time. 
Proof. Consider Algorithm 1 and in particular the ordering of the vertices specified in
its second step. We shall refer to an edge $(v_i, v_j)$ as forward if i < j; otherwise we
shall call it backward. Note that the vertices at the same level of the BFS tree can be
connected both by forward as well as backward edges. See also Fig. 1. Let $\ell$ be the
number of (forward) edges in a lightest path from s to a given vertex v. It follows that
any path from s to v, in particular a shortest t+light one, has to have at least $\ell$ forward
edges.

To see this, consider the BFS tree from the source s. Define the level of a vertex in
the tree as the number of edges on the path from s to the vertex in the tree. Thus, in
particular, level(s) = 0 while level(v) = $\ell$. Recall that the linear order extending the
partial BFS order used in Algorithm 1 is non-decreasing with respect to the levels of
vertices. Also, if (u, w) is a forward edge then level(u) ≤ level(w) ≤ level(u) + 1
and if (u, w) is a backward edge then level(u) ≥ level(w). Hence, any path from s to
v has to have at least $\ell$ forward edges, since each forward edge increases the level by
at most one and no backward edge increases it.

Since a t+light path from s to v has at most $\ell$ + t edges, a shortest t+light path from
s to v can have at most t backward edges.
Thus, it can be decomposed into at most 2t + 1 maximal fragments of consecutive edges
of the same type (i.e., forward or backward, respectively), where the even numbered
fragments consist of backward edges. Thus, the at most 2t + 1 calls of the procedures
SSSPDAG(G, D), reverseSSSPDAG(G, D) in the algorithm are sufficient to
detect a distance from s to v not exceeding the length of a shortest path among t+light
paths from s to v. The asymptotic running time of the algorithm is dominated by the
aforementioned procedure calls. Hence, it is O((t + 1)(m + n)).


Fig. 1. An example of a graph with a BFS vertex numbering and the two DAGs implied by forward 
and backward edges, respectively. 
We can obtain a representation of directed paths achieving the upper bounds on the
distances from the source provided in Theorem 1 in the form of a tree of paths emanating
from the source by backtracking. By setting t = n - 2 in this theorem, we can match
the best known SSSP algorithm for directed graphs with positive and negative real edge
weights, i.e., the Bellman-Ford algorithm and its constant factor improvements [11,
17], running in O(nm) time. As in the case of the Bellman-Ford algorithm, by
calling additionally reverseSSSPDAG(G, D) and SSSPDAG(G, D) after the last
iteration in Algorithm 1, we can detect the existence of negative cycles.

Comparing our algorithm with the Bellman-Ford one, note that if the lightest path
from the source to a vertex v has $\ell$ edges then $\ell$ + t iterations in the Bellman-Ford
algorithm may be needed to obtain an upper bound on the distance of v from the source
comparable to that obtained after O(t) iterations in Algorithm 1.


3 An Output-Sensitive APSP Algorithm for DAGs 


The APSP problem in DAGs with both positive and negative real edge weights can
be solved in O(n(n + m)) time by running n times the SSSP algorithm for DAGs. It
is an intriguing open problem whether there exist substantially more efficient algorithms
for APSP in edge-weighted DAGs. In this section, we make progress on this question
by providing an output-sensitive algorithm for this problem. Its running time depends
on the structure of shortest path trees. Although in the worst case it does not break the
O(nm) barrier, it seems to be substantially more efficient in the majority of cases.

The standard algorithm for APSP for DAGs just runs the SSSP algorithm for DAGs
for each vertex of the DAG as a source separately. Our APSP algorithm (Algorithm
2) does everything in one sweep along the topologically sorted order. Its main idea
is for each vertex v to compute the tree of lexicographically-first shortest paths from
the ancestors u of the currently processed vertex v to v, in the topologically sorted
order. For each ancestor u of v, Algorithm 2 proceeds as follows. In case the tree
of lexicographically-first shortest paths from the already considered ancestors of v
includes u (as some intermediate vertex), the algorithm is done with u. Otherwise,
Algorithm 2 finds the direct ancestor of v on the lexicographically-first shortest
path P from u to v and adds an initial fragment of P to the tree. By the topologically
sorted order in which the ancestors u of v are considered, the latter situation can happen
only when u is a leaf of the (final) tree of lexicographically-first shortest paths from the
ancestors of v to v. Algorithm 2 finds the direct ancestor of v on P by comparing the
lengths of shortest paths from u to v with different direct ancestors of v as the next
to the last vertex on the paths in time proportional to the indegree of v. It also finds
the initial fragment of P to add by using the link to the lexicographically-first shortest
path from u to the direct ancestor of v that is on P. The correctness of the algorithm
is immediate. The issues are an implementation of these steps and an estimation of the
running time.

To specify our output-sensitive algorithm (Algorithm 2) more exactly, we need the 
following definition. 


Definition 2. Assume a numbering of vertices in an edge-weighted DAG extending the
topological partial order. A shortest (directed) path P from $v_k$ to $v_i$ in the DAG is first
in a lexicographic order if the direct ancestor $v_j$ of $v_i$ on P has the lowest number j
among all direct ancestors of $v_i$ on shortest paths from $v_k$ to $v_i$ and the subpath of
P from $v_k$ to $v_j$ is the lexicographically-first shortest path from $v_k$ to $v_j$. For a vertex
$v_i$ in the DAG, the tree $T_{v_i}$ of (lexicographically-first) shortest paths is the union of
lexicographically-first paths from all ancestors of $v_i$ to $v_i$. Note that the vertex $v_i$ is a
sink of $T_{v_i}$. It is assumed to be the root of $T_{v_i}$ and $\mathrm{leaf}(T_{v_i})$ stands for the set of leaves
of $T_{v_i}$.


Algorithm 2 

Input: A DAG (V, E) with real edge weights.

Output: For each vertex v ∈ V, the tree T_v of lexicographically-first shortest paths from
all ancestors of v to v given by the table NEXT_v, where for each ancestor u of v,
NEXT_v(u) is the direct successor of u in the tree T_v (i.e., the head of the unique
directed edge having u as the tail in the tree).


1. Determine the source vertices, topologically sort the remaining vertices in V, and
   number the vertices in V accordingly, assigning to the sources the lowest numbers.
2. Set n to |V| and r to the number of sources in G.
3. Initialize an n × n table dist by setting dist(u, u) = 0 and dist(u, v) = ∞ for
   u, v ∈ V, u ≠ v.
4. for i = r+1, ..., n do
   (a) Compute the set A(v_i) of ancestors of v_i.
   (b) Initialize a 1-dimensional table NEXT_{v_i} of size |A(v_i)|,
       setting NEXT_{v_i}(v_j) to 0 for v_j ∈ A(v_i).
   (c) for v_k ∈ A(v_i) in increasing order of the index k do
       i. if NEXT_{v_i}(v_k) ≠ 0 then proceed to the next iteration of the interior for
          block.
       ii. Determine a direct ancestor v_j of v_i that minimizes the value of
           dist(v_k, v_j) + weight(v_j, v_i). In case of ties the vertex v_j with the smallest
           index j is chosen among those yielding the minimum.
       iii. v_current ← v_k
       iv. while v_current ≠ v_j ∧ NEXT_{v_i}(v_current) = 0 do
               dist(v_current, v_i) ← dist(v_current, v_j) + weight(v_j, v_i)
               NEXT_{v_i}(v_current) ← NEXT_{v_j}(v_current)
               v_current ← NEXT_{v_i}(v_current)
       v. if NEXT_{v_i}(v_j) = 0 then dist(v_j, v_i) ← weight(v_j, v_i) ∧
          NEXT_{v_i}(v_j) ← v_i
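A compact Python sketch of Algorithm 2 is given below. It is illustrative only and makes several assumptions not found in the paper: the function name `dag_apsp_lex_first`, a simple DAG given as `(tail, head, weight)` triples on vertices 0..n-1, dictionaries in place of the NEXT tables (a missing key plays the role of the value 0), and ancestor sets built by plain set unions. The set unions are simpler than the transitive-closure computation used in the analysis, so the sketch does not attain the stated time bound.

```python
import math
from collections import deque

def dag_apsp_lex_first(n, edges):
    """Returns (dist, nxt): dist[u][v] and nxt[v][u], the successor of u in T_v."""
    out = [[] for _ in range(n)]
    preds = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        out[u].append(v)
        preds[v].append((u, w))
        indeg[v] += 1
    # Step 1: Kahn's algorithm; every source leaves the queue before any other vertex.
    queue = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    rank = [0] * n
    for idx, v in enumerate(order):
        rank[v] = idx
    for plist in preds:
        plist.sort(key=lambda e: rank[e[0]])    # ties are won by the lowest number
    dist = [[math.inf] * n for _ in range(n)]   # Step 3
    for v in range(n):
        dist[v][v] = 0
    nxt = [dict() for _ in range(n)]
    anc = [set() for _ in range(n)]
    for v in order:                             # Step 4 (sources have no ancestors)
        for u, _ in preds[v]:                   # Step 4.a, simplified
            anc[v] |= anc[u]
            anc[v].add(u)
        for k in sorted(anc[v], key=rank.__getitem__):   # Step 4.c
            if k in nxt[v]:                     # Step 4.c.i: k is already in T_v
                continue
            best_j, best_w, best = None, None, math.inf
            for j, w in preds[v]:               # Step 4.c.ii: O(indeg(v)) work
                if dist[k][j] + w < best:
                    best_j, best_w, best = j, w, dist[k][j] + w
            cur = k                             # Step 4.c.iii
            while cur != best_j and cur not in nxt[v]:   # Step 4.c.iv
                dist[cur][v] = dist[cur][best_j] + best_w
                nxt[v][cur] = nxt[best_j][cur]
                cur = nxt[v][cur]
            if best_j not in nxt[v]:            # Step 4.c.v
                dist[best_j][v] = best_w
                nxt[v][best_j] = v
    return dist, nxt
```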


Lemma 1. Steps 4.c.iii-v add the missing fragments of a lexicographically-first shortest
path from $v_k$ to $v_i$ and set the distances from vertices in the fragments to $v_i$ in time
proportional to the number of vertices added to $T_{v_i}$.


Proof. Follow the path from $v_k$ to $v_j$ in $T_{v_j}$, extended by $(v_j, v_i)$, until a vertex
$v_q \in T_{v_i}$ is encountered. This is done in Steps 4.c.iii-v. The membership of $v_{current}$ in
$T_{v_i}$ is verified by checking whether or not $NEXT_{v_i}(v_{current}) = 0$. Also, if $v_{current}$ is
not yet in $T_{v_i}$, then its distance to $v_i$ is set by
$dist(v_{current}, v_i) \leftarrow dist(v_{current}, v_j) + weight(v_j, v_i)$ and it is added to $T_{v_i}$ by
$NEXT_{v_i}(v_{current}) \leftarrow NEXT_{v_j}(v_{current})$ in
Step 4.c.iv. By the inclusion of $v_q$ in $T_{v_i}$, a whole shortest path Q from $v_q$ to $v_i$ is
already included in $T_{v_i}$, by induction on the number of steps performed by the algorithm.
We claim that Q exactly overlaps with the final fragment of the extended path starting from
$v_q$. To see this, encode Q and the aforementioned fragment of the extended path by the
indices of their vertices in the reverse order. By our rule of resolving ties in Step 4.c.ii
both encodings should be first in the lexicographic order so we have an exact overlap.
For this reason, it is sufficient to add the initial fragment of the extended path ending at
$v_q$ to $T_{v_i}$, and if necessary also the edge $(v_j, v_i)$ to $T_{v_i}$, and to update the distances
from vertices in the added fragment to $v_i$, i.e., to perform Steps 4.c.iii-v.


Theorem 2. The APSP algorithm for a DAG (V, E) with n vertices, m edges
and real edge weights (Algorithm 2) runs in time
$O(\min\{n^{\omega}, nm + n^2 \log n\} + \sum_{v \in V} \mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)|)$.


Proof. The sets of ancestors can be determined in Step 4.a by first computing the transitive
closure of the input DAG in $O(\min\{n^{\omega}, nm\})$ time by using fast matrix multiplication
[13] or BFS [5]. In fact, to implement the loop in Step 4.c, we need the sets of
ancestors to be ordered according to the numbering of vertices provided in Step 1. If
the transitive closure matrix is computed, such an ordered set of ancestors can be easily
retrieved in O(n) time. Otherwise, additional preprocessing sorting the unordered sets
of ancestors is needed. The total cost of the additional preprocessing is $O(n^2 \log n)$.
All the remaining steps, excluding Steps 4.c.ii-v for vertices $v_k$ not yet in $T_{v_i}$, can
be done in total (i.e., over all iterations) time $O(\sum_{v \in V}(1 + |A(v)|)) = O(n^2)$, where
A(v) stands for the set of ancestors of v in the DAG. The time taken by Step 4.c.ii,
when $v_k$ is not yet in the current $T_{v_i}$, is $O(\mathrm{indeg}(v_i))$. Suppose that $v_k$ is not a leaf of
the final tree $T_{v_i}$. Then, there must exist some leaf $v_p$ of the final tree such that there is
a path from $v_p$ via $v_k$ to $v_i$ in this tree. By the numbering of vertices extending the partial
topological order, we have p < k. We infer that the aforementioned path is already
present in the current $T_{v_i}$. Thus, in particular the vertex $v_k$ is in the current tree, so
Step 4.c.ii is executed only for the leaves of the final tree. Hence,
the total time taken by Step 4.c.ii is $O(\sum_{v \in V} \mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)|)$. Finally, the total
time taken by Steps 4.c.iii-v is $O(\sum_{v \in V}(1 + |A(v)|))$ by Lemma 1.


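Note that the following inequalities hold:

$$\sum_{v \in V} \mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)| \le m \max_{v \in V} |\mathrm{leaf}(T_v)|,$$

$$\sum_{v \in V} \mathrm{indeg}(v)\,|\mathrm{leaf}(T_v)| \le n^2 \, \frac{\sum_{v \in V} |\mathrm{leaf}(T_v)|}{n}.$$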


They immediately yield the following corollary from Theorem 2. 


Corollary 1. Let G = (V, E) be a DAG with n vertices and m edges with
real edge weights. Suppose $\max_{v \in V} |\mathrm{leaf}(T_v)| = O(n^{\alpha})$ and
$\frac{1}{n}\sum_{v \in V} |\mathrm{leaf}(T_v)| = O(n^{\beta})$. The APSP problem for G is solved by Algorithm 2
in time

$$O(\min\{n^{\omega}, nm + n^2 \log n\} + \min\{mn^{\alpha}, n^{2+\beta}\}).$$
Observe that $|\mathrm{leaf}(T_v)|$ is equal to the minimum number of directed paths covering
the tree $T_v$. Hence, $\alpha < 1$ if the maximum of the minimum number of paths covering
$T_v$ over v is substantially sublinear. Similarly, $\beta < 1$ if the average of the minimum
number of paths covering $T_v$ over v is substantially sublinear.

To illustrate the superiority of Algorithm 2 over the standard O(n(n + m))-time 
method for APSP in DAGs, consider the following simple, extreme example. 

Suppose M is a positive integer. Let D be a DAG with vertices $v_1, v_2, \ldots, v_n$, and edges
$(v_i, v_j)$, where i < j, such that the weight of $(v_i, v_j)$ is -1 if j = i + 1 and M otherwise.
It is easy to see that the tree $T_{v_i}$ is just the path $v_1, v_2, \ldots, v_i$ and hence
$|\mathrm{leaf}(T_{v_i})| = 1$. Consequently, Algorithm 2 on the DAG D runs in $O(n^{\omega})$ time while
the standard method requires $O(n^3)$ time. If M = 1, one could also run Zwick's APSP
algorithm for directed graphs with edge weights in {-1, 0, 1} on this example in
$O(n^{2.575})$ time [16].
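The claim about the extreme example can be checked numerically on small instances. The sketch below builds the DAG implicitly and, for every target vertex, counts the leaves of its shortest-path tree. The function name `extreme_example_leaf_counts` and the 0-based numbering are assumptions of this example. Since shortest paths in this DAG are unique, the lexicographic tie-breaking rule plays no role here.

```python
def extreme_example_leaf_counts(n, M):
    """Leaf counts |leaf(T_v)| for targets v = 1..n-1 (0-based) of the example DAG."""
    def weight(i, j):
        # Edge (i, j) exists for every i < j; consecutive vertices get weight -1.
        return -1 if j == i + 1 else M

    counts = []
    for target in range(1, n):
        to_target = {target: 0}     # shortest distance from each vertex to the target
        succ = {}                   # successor on the shortest path to the target
        for k in range(target - 1, -1, -1):
            best_j = min(range(k + 1, target + 1),
                         key=lambda j: weight(k, j) + to_target[j])
            to_target[k] = weight(k, best_j) + to_target[best_j]
            succ[k] = best_j
        has_tree_predecessor = set(succ.values())
        leaves = [k for k in succ if k not in has_tree_predecessor]
        counts.append(len(leaves))
    return counts
```

With every leaf count equal to 1, the sum of indeg(v)|leaf(T_v)| over all v reduces to the number of edges m.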

For a refinement of Theorem 2, see the full version [9]. 


4 A Potential Extension to Digraphs with Large Cycles 


As we have already noted, the APSP problem in DAGs with both positive and negative
real edge weights can be solved in O(n(n + m)) time. It is also an interesting open
problem whether one can derive substantially more efficient algorithms for APSP in DAGs
than the O(n(n + m))-time method in the case of restricted edge weights, e.g.,
non-negative edge weights. In this section, under the assumption of the existence of such
substantially more efficient algorithms for DAGs with non-negative edge weights, we show
that they could be extended to include directed graphs having a polynomial number of large
cycles.

The idea of the extension is fairly simple, see Fig. 2. We pick uniformly at random a 
sample of vertices of the input directed graph that hits all the directed cycles with high 
probability (cf. [16]). Here, we use the assumption on the minimum size of the cycles 
and on the polynomially bounded number of the cycles. Next, we remove the vertices 
belonging to the sample and run the hypothetical fast algorithm for APSP in DAGs on 
the resulting subgraph of the input graph which is acyclic with high probability. In order 
to take into account shortest path connections using the removed vertices, we run
Dijkstra's SSSP algorithm from each vertex in the sample on the original input graph
two times. In the second run we reverse the directions of the edges in the input graph. 
Finally, we update the shortest path distances appropriately. 


Algorithm 3

Input: A directed graph (V, E) with n vertices, m non-negatively weighted edges and
a polynomial number of directed cycles, each with at least d vertices.

Output: The shortest-path distances for all ordered pairs of vertices in V.


1. Initialize an n × n array D by setting all its entries outside the main diagonal to +∞
   and those on the diagonal to zero.

2. Uniformly at random pick a sample S of O(n ln n/d) vertices from V.

3. Run the hypothetical APSP algorithm for DAGs on the graph
   (V \ S, E ∩ {(u, v) | u, v ∈ V \ S}) and for each pair u, v ∈ V \ S, set D(u, v) to
   the distance determined by the algorithm.

4. For each s ∈ S, run Dijkstra's SSSP algorithm with s as the source in (V, E)
   and for all v ∈ V \ {s} update the D(s, v) entries accordingly.

5. For each s ∈ S, run Dijkstra's SSSP algorithm with s as the source on the
   directed graph resulting from reversing the directions of the edges in (V, E), and for
   all v ∈ V \ {s} update the D(v, s) entries accordingly.

6. For all pairs u, v of distinct vertices in V \ S, and for all vertices s ∈ S, set
   D(u, v) = min{D(u, v), D(u, s) + D(s, v)}.


Fig. 2. An example of a directed cycle that can be broken by removing the encircled vertex
belonging to the sample. To find shortest-path connections passing through this vertex two SSSP
runs from it are performed, in the original and the reversed edge directions, respectively.
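The steps of Algorithm 3 can be sketched as follows. Since the fast APSP algorithm for DAGs is hypothetical, the sketch substitutes the standard method (one topological sweep per source) as a stand-in. All names (`dijkstra`, `dag_apsp_baseline`, `pick_sample`, `apsp_large_cycles`), the constant `c` in the sample size, and the choice to pass the sample in explicitly are assumptions made for illustration.

```python
import heapq
import math
import random

def dijkstra(n, adj, source):
    dist = [math.inf] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                       # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def dag_apsp_baseline(vertices, adj):
    """Stand-in for the hypothetical algorithm: one topological sweep per source."""
    indeg = {v: 0 for v in vertices}
    for u in vertices:
        for v, _ in adj[u]:
            indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    order = []
    while stack:
        u = stack.pop()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    if len(order) != len(vertices):
        raise ValueError("the sample did not hit every directed cycle")
    result = {}
    for s in vertices:
        dist = {v: math.inf for v in vertices}
        dist[s] = 0
        for u in order:
            if dist[u] < math.inf:
                for v, w in adj[u]:
                    dist[v] = min(dist[v], dist[u] + w)
        result[s] = dist
    return result

def pick_sample(n, d, c=2.0, rng=random):
    """Step 2: a uniform sample of about c * n * ln(n) / d vertices (c is illustrative)."""
    size = min(n, math.ceil(c * n * math.log(max(n, 2)) / d))
    return rng.sample(range(n), size)

def apsp_large_cycles(n, edges, sample):
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        radj[v].append((u, w))
    S = set(sample)
    rest = [v for v in range(n) if v not in S]
    D = [[0 if u == v else math.inf for v in range(n)] for u in range(n)]  # Step 1
    sub_adj = {u: [(v, w) for v, w in adj[u] if v not in S] for u in rest}
    sub = dag_apsp_baseline(rest, sub_adj)                                 # Step 3
    for u in rest:
        for v in rest:
            D[u][v] = sub[u][v]
    for s in S:
        D[s] = dijkstra(n, adj, s)                                         # Step 4
        back = dijkstra(n, radj, s)                                        # Step 5
        for v in range(n):
            D[v][s] = back[v]
    for u in rest:                                                         # Step 6
        for v in rest:
            for s in S:
                D[u][v] = min(D[u][v], D[u][s] + D[s][v])
    return D
```

A typical call would be `apsp_large_cycles(n, edges, pick_sample(n, d))`. Passing the sample as an argument keeps the sketch deterministic for testing.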


Theorem 3. Let t(n, m) be the time required by APSP in DAGs with n vertices and
m non-negatively weighted edges. Algorithm 3 solves the APSP problem for a directed
graph with n vertices, m non-negatively weighted edges and a polynomial number of
directed cycles, each with at least d vertices, in $O(t(n, m) + n^3 \ln n/d)$ time with high
probability.


Proof. Suppose that the number of directed cycles in the input graph (V, E) is $O(n^c)$.
By picking a large enough constant for the expression n ln n/d specifying the size of the
sample S, the probability that a given directed cycle in G is not hit by S can be made
smaller than $n^{-c-1}$. Hence, the probability that the graph resulting from removing the
vertices in S is not acyclic becomes smaller than $n^{-1}$. It follows that Algorithm 3 is
correct with high probability. It remains to estimate its running time. Steps 1, 2 can be
easily implemented in $O(n^2)$ time. Step 3 takes t(n, m) time. Steps 4, 5 can be
implemented in $O((n \ln n/d) \times m + n^2 \ln^2 n/d)$ time [5]. Finally, Step 6 takes
$O(n^3 \ln n/d)$ time.
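The probability estimate in the proof can be spelled out by a short calculation. The sketch below assumes, for simplicity, that the $|S| = K n \ln n / d$ sample vertices are drawn independently with replacement and that the number of directed cycles is at most $a\,n^c$. The constants K and a are introduced here for illustration only.

```latex
\begin{align*}
\Pr[\text{a fixed cycle } C \text{ is not hit by } S]
  &\le \left(1 - \frac{d}{n}\right)^{|S|}
   \le e^{-d|S|/n}
   = e^{-K \ln n}
   = n^{-K}, \\
\Pr[\text{some cycle is not hit by } S]
  &\le a\,n^{c} \cdot n^{-K}
   = a\,n^{c-K}.
\end{align*}
```

The first line uses that every cycle has at least d vertices, and the second is a union bound over all cycles. Any $K \ge c + 1$ gives the per-cycle bound $n^{-c-1}$, and any $K > c + 1 + \log_n a$ makes the overall failure probability smaller than $n^{-1}$.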


Note that because of the term $n^3 \ln n/d$ in the upper time-bound given by
Theorem 3, the upper bound can be substantially subcubic only when $d = \Omega(n^{\delta})$ for
some $\delta > 0$.


5 Final Remarks 


In the absence of substantial asymptotic improvements to the time complexity of basic
shortest-path algorithms, often formulated at the end of the 1950s, like the Bellman-Ford
algorithm and Dijkstra's algorithm, the results presented in this paper should be of interest.
Our output-sensitive algorithm for the general APSP problem in DAGs could possibly
lead to an improvement of the asymptotic time complexity of this problem in the average
case. A probabilistic analysis of the number of leaves in the lexicographically-first
shortest-path trees is an interesting open problem.

In the vast literature on shortest path problems, there are several examples of
output-sensitive algorithms. For instance, Karger et al. [8] and McGeoch [10] orchestrated
the n runs of Dijkstra's algorithm in order to solve the APSP problem for directed
graphs with non-negative edge weights in $O(m^* n + n^2 \log n)$ time, where $m^*$ is the
number of (essential) edges that participate in shortest paths.

Finally, note that DAGs have several important scientific and computational
applications in, among other things, scheduling, data processing networks, biology
(phylogenetic networks, epidemiology), sociology (citation networks), and data compression.
For these reasons, efficient algorithms for shortest paths in DAGs are not only of
theoretical interest.
Abstract. Finding densest subgraphs is a fundamental problem in
graph mining, with several applications in different fields. In this paper,
we consider two variants of the problem of covering a graph with k densest
subgraphs, where k ≥ 2. The first variant aims to find a collection of
k subgraphs of maximum density; the second variant asks for a set of k
subgraphs that maximize an objective function that includes
the sum of the subgraphs' densities and a distance function, in order to
differentiate the computed subgraphs. We show that the first variant of
the problem is solvable in polynomial time, for any k ≥ 2. For the second
variant, which is NP-hard for k ≥ 3, we present an approximation
algorithm that achieves a factor of 2.


1 Introduction 


Identifying cohesive subgraphs is fundamental in graph mining and graph theory.
In several fields, from social network analysis [18] to computational biology [9],
cohesive groups of elements are often related to functionalities of a complex
system and it is indeed a fundamental task to identify such cohesive subgraphs.

Several models of cohesive subgraphs have been considered in the literature.
The first model to be studied was the clique [20], that is, a complete graph. However,
several alternative definitions of cohesive subgraphs have been introduced
and studied in the literature (see for example the review in [16]), including the
densest subgraph. This latter model asks for a subgraph that maximizes the ratio
between the number of edges and the number of vertices. Compared to other
models, densest subgraphs have the advantage that they can be found in polynomial
time [11,12,14] and can also be approximated in linear time within a factor of 1/2
[2,5,15,17]. The computational tractability and the natural definition of density
have led to a prominent position in graph mining [1,3,4,6,10,21,23,24,27].

The first goal of research in graph mining has been the identification of
a single cohesive subgraph, with few exceptions, for example the problem of
covering a graph with cliques [13]. In recent years, the interest has moved to the
identification of more than one dense subgraph inside a network [4,8,10,26], rather
than a single subgraph. This interest is motivated by new approaches to network
analysis that require in several cases the identification of the main cohesive
groups of a network, rather than a single subgraph. Indeed, many real-world
networks contain several cohesive groups that may also share common elements,
like hubs, that usually belong to many communities [10,19]. Considering the
densest subgraph model, the proposed approaches in this direction ask for a
collection of dense subgraphs that may share vertices [4,10,21,25]. While the
approaches ask for a collection of k ≥ 1 densest subgraphs, they differ in the
way the overlapping of the subgraphs is handled. Balalau et al. [4] define a hard
constraint for the overlapping, based on the Jaccard index, allowing only a
fraction of the vertices to be shared by two subgraphs. Galbrun et al. [10] do
not define a hard constraint on the overlapping of two subgraphs that can be
identical or very similar, but include a distance function in the objective function
to be maximized. Both versions of the problem are NP-hard [4,7]; the second is
also known to be approximable within factor 4 [7,10].

© Springer Nature Switzerland AG 2022
N. Balachandran and R. Inkulu (Eds.): CALDAM 2022, LNCS 13179, pp. 152-163, 2022.
https://doi.org/10.1007/978-3-030-95018-7_13

In this paper, following an approach proposed by Rozenshtein et al. [22]
related to temporal graphs, we consider the problem of finding a collection of k
dense subgraphs that cover the input graph. Notice that when k = 1 the problem
is trivial, as the solution must be the input graph. Hence in the paper we focus
on the case k ≥ 2 and we consider two variants of the problem. In the first
variant, the objective function is the sum of the densities of the k subgraphs,
without any constraints except that they must cover the input graph. For the
second variant, similar to [7,10], the objective function includes both the sum of
the densities of the k subgraphs and a distance function, in order to differentiate
the subgraphs included in a solution. However, notice that the approximation
algorithms presented in [7,10] cannot be directly applied to this variant of
the problem, as they may produce solutions that do not cover the input graph.

The paper is organized as follows. Next, in Sect. 2, we introduce the main
concepts and we give the formal definitions of the problems we are interested
in. In Sect. 3, we show that the first variant of the problem is solvable in
polynomial time, for any k ≥ 2, by showing that there exists an optimal solution
that consists of k - 1 densest subgraphs and the input graph. For the second
variant, which as we observe in Sect. 2 is NP-hard for k ≥ 3, we present an
approximation algorithm that achieves an approximation factor of 2 in Sect. 4.
In Sect. 5 we present conclusions and open problems.

Some of the proofs are omitted due to page constraints.


2 Definitions 


We consider undirected and unweighted graphs. Given a graph G = (V, E), the
density of G, denoted by dens(G), is equal to $\mathrm{dens}(G) = \frac{|E|}{|V|}$.

Given a graph G = (V, E) and a subset $V' \subseteq V$, we denote by $G[V'] = (V', E')$
the subgraph of G induced by $V'$. Two subgraphs $G_1 = (V_1, E_1)$, $G_2 = (V_2, E_2)$
of G = (V, E) are distinct if $V_1 \ne V_2$.
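These two definitions translate directly into code. In the minimal sketch below, a graph is given by a list of undirected edges (pairs of vertices) and a subgraph is identified with its vertex set. The helper names `induced_edges` and `density` are assumptions of this example.

```python
def induced_edges(edges, vertices):
    """Edge set E' of the induced subgraph G[V'], stored as unordered pairs."""
    vs = set(vertices)
    return {frozenset((u, v)) for u, v in edges if u != v and u in vs and v in vs}

def density(edges, vertices):
    """dens(G[V']) = |E'| / |V'| for a non-empty vertex set V'."""
    vs = set(vertices)
    if not vs:
        raise ValueError("density is undefined for an empty vertex set")
    return len(induced_edges(edges, vs)) / len(vs)
```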

We are now able to consider the first problem we are interested in, called
k-Densest Cover Subgraphs.

Problem 1. k-Densest Cover Subgraphs

Input: A graph G = (V, E), an integer k, with 2 ≤ k ≤ |V|.

Output: A collection $S = \{G_1 = (V_1, E_1), \ldots, G_k = (V_k, E_k)\}$ of k subgraphs of
G such that $\bigcup_{i=1}^{k} V_i = V$ and the profit p(S) of S,

$$p(S) = \sum_{i=1}^{k} \mathrm{dens}(G_i),$$

is maximized.


Notice that some subgraphs in S can be identical.

The second problem we consider asks for a set of k subgraphs that optimize
an objective function that includes a distance between subgraphs. We start by
defining the distance we consider. The distance function we define is inspired by
that proposed in [10], except that it has value in the range [0, 1] and not in the range
[0, 2] (actually the distance function defined in [10] has value in [1, 2] when two
subgraphs are different).


Definition 1. Given a graph G = (V, E) and two subgraphs G[A], G[B], with
$A, B \subseteq V$, the distance function $d : 2^V \times 2^V \to \mathbb{R}_{+}$ between two sets $A, B \subseteq V$
that induce subgraphs G[A] and G[B], respectively, is defined as follows:

$$d(A, B) = 1 - \frac{|A \cap B|^2}{|A|\,|B|}.$$

Notice that $0 \le d(A, B) \le 1$. Indeed, if G[A] and G[B] are disjoint, that is if
$A \cap B = \emptyset$, then d(A, B) = 1, while d(A, B) = 0 if G[A] and G[B] are the same
subgraph, that is A = B.
Now, we are able to define the second problem we are interested in, called
Top k-Cover-Densest Subgraphs.


Problem 2. Top k-Cover-Densest Subgraphs

Input: A graph G = (V, E), a parameter $\lambda > 0$, an integer k, with 2 ≤ k ≤ |V|.
Output: A set $S = \{G_1 = (V_1, E_1), \ldots, G_k = (V_k, E_k)\}$ of k pairwise distinct
subgraphs, with $V_i \subseteq V$, $1 \le i \le k$, and $\bigcup_{i=1}^{k} V_i = V$, that maximizes the
following value

$$p(S) = \sum_{i=1}^{k} \mathrm{dens}(G_i) + \lambda \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} d(V_i, V_j).$$

Notice that in Top k-Cover-Densest Subgraphs, since S is a set, the subgraphs
in S must be distinct, unlike in k-Densest Cover Subgraphs. The first term of p(S)
is called the density profit, while the second term of p(S) is called the distance
profit.
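As an illustration, the distance function and the objective of Top k-Cover-Densest Subgraphs can be evaluated as in the sketch below. Subgraphs are represented by their vertex sets, and the names `density`, `distance`, `is_feasible`, `profit` and the parameter `lam` (standing for λ) are assumptions of this example.

```python
from itertools import combinations

def density(edges, vertices):
    """Number of induced edges divided by the number of vertices."""
    vs = set(vertices)
    inside = {frozenset((u, v)) for u, v in edges if u != v and u in vs and v in vs}
    return len(inside) / len(vs)

def distance(a, b):
    """d(A, B) = 1 - |A ∩ B|^2 / (|A| |B|) for non-empty vertex sets A and B."""
    a, b = set(a), set(b)
    return 1 - len(a & b) ** 2 / (len(a) * len(b))

def is_feasible(all_vertices, vertex_sets):
    """The vertex sets must be pairwise distinct and must cover V."""
    sets = [frozenset(s) for s in vertex_sets]
    covered = set().union(*sets) if sets else set()
    return len(set(sets)) == len(sets) and covered == set(all_vertices)

def profit(edges, vertex_sets, lam):
    """Density profit plus lam times the distance profit."""
    density_profit = sum(density(edges, s) for s in vertex_sets)
    distance_profit = sum(distance(a, b) for a, b in combinations(vertex_sets, 2))
    return density_profit + lam * distance_profit
```

For a triangle 0, 1, 2 with a pendant vertex 3 attached to 2, the solution {{0, 1, 2}, {2, 3}} with λ = 2 has density profit 1.5 and distance profit 5/6.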

In [7] a similar problem, termed Top k-Densest Subgraphs, was proven to
be NP-hard. Using a similar reduction we can prove that the Top
k-Cover-Densest Subgraphs problem is NP-hard.


Corollary 1 (of Theorem 5 [7]). Top k-Cover-Densest Subgraphs problem is
NP-hard for $k \ge 3$.

2.1 Algorithms for Densest Subgraph


The algorithms we will present are based on computing a densest subgraph of a 
graph, that is the Densest Subgraph problem. Given a graph G = (V, E), Densest 
Subgraph can be solved in polynomial time [11,12] via a reduction to Minimum 
Cut, with a complexity of O(|V||E| log |E|) or O(|V|) [14]. 

We also consider the problem of computing a constrained densest subgraph,
that is a subgraph that is forced to contain a given set $V'$ of vertices. The problem
can be solved in polynomial time, with essentially the same time complexity as
Goldberg's Algorithm [27]. Given a graph $G = (V, E)$ and a set $U \subseteq V$, we
denote by Densest(G, U) a densest subgraph of G that is forced to contain U. If
U consists of a single vertex u, we abuse the notation and write Densest(G, u).
If S is a set of subgraphs such that G[U] is not in S, then Densest(G, U, S) is
Densest(G, U), if this subgraph is not in S, else it is G[U].
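To make the notation Densest(G, U) concrete, the following sketch computes a constrained densest subgraph by exhaustive search. This is not Goldberg's algorithm or the algorithm of [27]: it is an exponential-time stand-in that is only meant for tiny graphs, and the function names and the example graph are assumptions made for illustration.

```python
from itertools import combinations

def density(vertex_set, edges):
    """dens(G[A]) = |E(G[A])| / |A| for the subgraph induced by vertex_set."""
    vs = set(vertex_set)
    return sum(1 for u, v in edges if u in vs and v in vs) / len(vs)

def densest(vertices, edges, forced=()):
    """Brute-force Densest(G, U): a densest induced subgraph containing `forced`.

    Enumerates every superset of `forced`, so it is exponential in |V|.
    """
    forced = frozenset(forced)
    free = sorted(set(vertices) - forced)
    best, best_density = None, -1.0
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            candidate = forced | frozenset(extra)
            if not candidate:
                continue  # the empty vertex set is not a subgraph
            d = density(candidate, edges)
            if d > best_density:
                best, best_density = candidate, d
    return best, best_density

# Toy instance (assumption): a clique on {1, 2, 3, 4} plus a pendant vertex 5.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
unconstrained = densest(V, E)            # the clique, density 6/4
through_5 = densest(V, E, forced={5})    # must contain vertex 5
```

On this instance, forcing vertex 5 into the subgraph makes the whole graph (density 7/5) the best choice, since every smaller subgraph containing 5 is sparser.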


3 A Polynomial Time Algorithm for the k-Densest Cover
Subgraphs


In this section we show a polynomial time algorithm that provides an optimal
solution to the k-Densest Cover Subgraphs problem. Our algorithm is very simple:
the optimal solution $S = \{G_1 = (V_1, E_1), \dots, G_k = (V_k, E_k)\}$ has
$G_1 = G_2 = \dots = G_{k-1}$ equal to the densest subgraph, while $G_k = G$, that is the entire
input graph. In what follows we prove the correctness of the algorithm.

We first prove an auxiliary lemma regarding the densest subgraph of a graph. 


Lemma 1. Let $G_d = (V_d, E_d)$ be a densest subgraph of a graph $G = (V, E)$.
Let $X \subseteq V_d$, let $G_z = (V_d - X, E_z)$ be the subgraph induced by $V_d - X$ and let
$Y = E_d - E_z$. That is, $Y$ is the set of edges that are removed from the densest
subgraph after removing the set of vertices $X$, or, in other words, the edges
between the vertices of $X$ and the edges with one endpoint in $X$ and another
endpoint in the set $V_d - X$. It holds that:

$$\frac{|Y|}{|X|} \ge \frac{|E_d|}{|V_d|}.$$
Proof. We know that:

$$\frac{|E_d|}{|V_d|} \ge \mathrm{dens}(G_z) = \frac{|E_d| - |Y|}{|V_d| - |X|},$$

otherwise we can simply remove the set $X$ of vertices and obtain a subgraph
denser than $G_d$. Then, we get that:

$$|E_d|\,(|V_d| - |X|) \ge |V_d|\,(|E_d| - |Y|)$$

$$-|E_d|\,|X| \ge -|V_d|\,|Y|$$

$$|E_d|\,|X| \le |V_d|\,|Y|$$

$$\frac{|E_d|}{|V_d|} \le \frac{|Y|}{|X|}.$$


We know that $G_1, G_2, \dots, G_k$ must cover all the vertices of $G$. Now, we show
that there exists an optimal solution for the k-Densest Cover Subgraphs such that
all the graphs in the solution include a densest subgraph.


Lemma 2. There exists an optimal solution $S = \{G_1 = (V_1, E_1), G_2 =
(V_2, E_2), \dots, G_k = (V_k, E_k)\}$ to the k-Densest Cover Subgraphs problem that
has the property that all the graphs $G_i \in S$, $1 \le i \le k$, include a densest subgraph.


Proof. Assume by contradiction that there exists a densest subgraph $G_d =
(V_d, E_d)$ and a graph $G_i = (V_i, E_i) \in S$ such that $V_d - V_i \ne \emptyset$. Let $X = V_d - V_i$
and let $Y = E_d - E_i$. According to Lemma 1:

$$\frac{|Y|}{|X|} \ge \frac{|E_d|}{|V_d|}.$$

By construction, $G_i \cup G_d$ has at least $|Y| + |E_i|$ edges and precisely $|X| + |V_i|$
vertices. We show that:

$$\frac{|Y| + |E_i|}{|X| + |V_i|} \ge \frac{|E_i|}{|V_i|}.$$

Indeed, from Lemma 1 and since $G_d$ is a densest subgraph, it holds that

$$\frac{|Y|}{|X|} \ge \frac{|E_d|}{|V_d|} \ge \frac{|E_i|}{|V_i|},$$

thus

$$\frac{|Y| + |E_i|}{|X| + |V_i|} \ge \frac{\frac{|E_i|}{|V_i|}|X| + |E_i|}{|X| + |V_i|} = \frac{|E_i|}{|V_i|} \cdot \frac{|X| + |V_i|}{|X| + |V_i|} = \frac{|E_i|}{|V_i|}.$$


An immediate consequence of Lemma 2 is the fact that the maximal densest 
subgraph (a densest subgraph of maximum size) is unique. We use this property 
later in the proof of Theorem 1. 


Lemma 3. Given a graph $G = (V, E)$, the maximal densest subgraph $G_d =
(V_d, E_d)$ in $G$ is unique.


Proof. Assume by contradiction that there exist two distinct maximal densest
subgraphs $G_d = (V_d, E_d)$ and $G'_d = (V'_d, E'_d)$. Since $G_d$ and $G'_d$ are distinct, then
$V_d - V'_d \ne \emptyset$ and $V'_d - V_d \ne \emptyset$. Thus, we can apply the same argument as in
Lemma 2 and by taking the union of $V_d$, $V'_d$ we obtain another densest subgraph
that strictly includes both $G_d$ and $G'_d$. Thus, we contradict the fact that $G_d$ and
$G'_d$ are maximal and the lemma follows.

Lemma 3 shows that there exists a unique maximal densest subgraph. Next,
we discuss how to compute it in polynomial time.


1. Compute a densest subgraph $G_d = (V_d, E_d)$, using Goldberg's algorithm
   of [12].
2. For each $v \in V - V_d$, by applying the algorithm presented in [27] (see Sect. 2.1),
   compute in polynomial time $G'_d(v) \leftarrow$ Densest$(G, V_d \cup \{v\})$, a densest subgraph
   in $G$ that includes $V_d \cup \{v\}$.
3. If no subgraph $G'_d(v)$ is as dense as $G_d$, return $G_d$.
4. Else, for a subgraph $G'_d(v)$ that is as dense as $G_d$, define $G_d \leftarrow G'_d(v)$ and
   iterate the algorithm from point 2.
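The four steps above can be sketched as follows. The sketch replaces both Goldberg's algorithm and the constrained algorithm of [27] with an exhaustive search (an assumption made to keep the example self-contained), so it only illustrates the control flow of the procedure, not its polynomial running time. Exact fractions are used so that the "as dense as" test in step 3 is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

def density(vertex_set, edges):
    """Exact density |E(G[A])| / |A| of the induced subgraph."""
    vs = set(vertex_set)
    return Fraction(sum(1 for u, v in edges if u in vs and v in vs), len(vs))

def densest(vertices, edges, forced=frozenset()):
    """Brute-force stand-in for Densest(G, U); exponential in |V|."""
    forced = frozenset(forced)
    free = sorted(set(vertices) - forced)
    candidates = (forced | frozenset(c)
                  for r in range(len(free) + 1)
                  for c in combinations(free, r))
    return max((c for c in candidates if c), key=lambda c: density(c, edges))

def maximal_densest(vertices, edges):
    """Grow a densest subgraph until no vertex can be added at equal density."""
    current = densest(vertices, edges)                   # step 1
    target = density(current, edges)
    grown = True
    while grown:
        grown = False
        for v in sorted(set(vertices) - current):        # step 2
            candidate = densest(vertices, edges, current | {v})
            if density(candidate, edges) == target:      # step 4
                current, grown = candidate, True
                break
    return current                                       # step 3

# Toy instance (assumption): triangle 1-2-3 with pendant vertex 4.
# Both the triangle and the whole graph have density 1.
triangle_plus_pendant = [(1, 2), (2, 3), (1, 3), (3, 4)]
result = maximal_densest([1, 2, 3, 4], triangle_plus_pendant)
```

On the toy instance the first densest subgraph found is the triangle, and step 2 then discovers that adding vertex 4 keeps the density at 1, so the procedure returns the whole graph.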


From Lemma 2, we can conclude that there exists an optimal solution of k- 
Densest Cover Subgraphs problem where all the subgraphs include the maximal 
densest subgraph of G. 


Corollary 2. There exists an optimal solution $S = \{G_1 = (V_1, E_1), G_2 =
(V_2, E_2), \dots, G_k = (V_k, E_k)\}$ to the k-Densest Cover Subgraphs problem that has
the property that all the graphs $G_i \in S$, $1 \le i \le k$, include the maximal densest
subgraph of $G$.


Next, we show a property of two subgraphs in an optimal solution of
k-Densest Cover Subgraphs that will be useful to prove the main result of this
section.


Lemma 4. Let $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ be two graphs in an optimal
solution of k-Densest Cover Subgraphs problem that satisfies Corollary 2 and let
$G_u = G[V_1 \cup V_2]$ and $G_d$ be the maximal densest subgraph. Then:

$$\mathrm{dens}(G_d) + \mathrm{dens}(G_u) \ge \mathrm{dens}(G_1) + \mathrm{dens}(G_2).$$

We now prove the main theorem of this section.


Theorem 1. There exists an optimal solution $S = \{G_1, G_2, \dots, G_k\}$ to the
k-Densest Cover Subgraphs problem such that $G_1 = G_2 = \dots = G_{k-1} = G_d$, where
$G_d$ is the maximal densest subgraph and $G_k = G$.


Proof. Let $S' = \{G'_1, G'_2, \dots, G'_k\}$ be an optimal solution to the k-Densest Cover
Subgraphs problem such that $G'_k$ is not a densest subgraph. We show that we can
modify $S'$ so that the resulting solution $S = \{G_1, G_2, \dots, G_k\}$ has the following
properties: (1) $G_1 = G_2 = \dots = G_{k-1} = G_d$, where $G_d$ is the maximal densest
subgraph, and $G_k = G$, (2) $\mathrm{dens}(S) \ge \mathrm{dens}(S')$.

First, observe that both solutions $S$ and $S'$ cover all the vertices of $G$. Now,
assume that we have two graphs $G'_i$ and $G'_j$ in $S'$ that are not identical to the
maximal densest subgraph $G_d$. Then, according to Lemma 4, we can replace $G'_i$
and $G'_j$ with the maximal densest subgraph $G_d$ and $G'_i \cup G'_j$.

Thus, we can assume that there exists only one graph $G'_i \in S'$ such that $G'_i$
is not the maximal densest subgraph of $G$. Assume without loss of generality
that $i \ne 1$. Then $G'_1 \cup G'_i = G$, as the input graph must be covered. Thus $G'_i$ includes
$G - G_d$ and, according to Corollary 2, $G'_i$ includes $G_d$ and therefore $G'_i = G$ and
the theorem follows.
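As a sanity check of Theorem 1 on a tiny instance, the sketch below builds the solution consisting of k - 1 copies of the maximal densest subgraph plus the input graph, and compares its profit with an exhaustive search over all covering collections of k subgraphs. Both routines are brute force and exponential; they are illustrative assumptions, not the polynomial time algorithm of this section.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

def density(vertex_set, edges):
    """Exact density |E(G[A])| / |A| of the induced subgraph."""
    return Fraction(sum(1 for u, v in edges if u in vertex_set and v in vertex_set),
                    len(vertex_set))

def nonempty_subsets(vertices):
    vs = sorted(vertices)
    return [frozenset(c) for r in range(1, len(vs) + 1) for c in combinations(vs, r)]

def theorem1_solution(vertices, edges, k):
    """k - 1 copies of the maximal densest subgraph, plus the whole graph."""
    # Highest density first, ties broken by size: the maximal densest subgraph.
    g_d = max(nonempty_subsets(vertices), key=lambda s: (density(s, edges), len(s)))
    return [g_d] * (k - 1) + [frozenset(vertices)]

def brute_force_optimum(vertices, edges, k):
    """Best total density over all collections of k subgraphs that cover V."""
    all_vertices = frozenset(vertices)
    best = None
    for collection in combinations_with_replacement(nonempty_subsets(vertices), k):
        if frozenset().union(*collection) == all_vertices:
            value = sum(density(s, edges) for s in collection)
            if best is None or value > best:
                best = value
    return best

# Toy instance (assumption): a clique on {1, 2, 3, 4} plus a pendant vertex 5.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
solution = theorem1_solution(V, E, k=3)
solution_profit = sum(density(s, E) for s in solution)
```

For this graph the constructed solution has profit 3/2 + 3/2 + 7/5, which matches the exhaustive optimum because any subgraph covering vertex 5 has density at most 7/5.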

4 An Approximation Algorithm for Top k-Cover-Densest
Subgraphs 


In this section, we present an approximation algorithm for Top k-Cover-Densest
Subgraphs. The approximation algorithm outputs whichever has the larger profit
of the two sets of k subgraphs computed by two algorithms, called
Approx-Dens and Dist. Both Approx-Dens and Dist compute a set of k subgraphs
that cover G, but Approx-Dens aims at maximizing the density of the subgraphs,
while Dist aims at maximizing the distance between the subgraphs. We start by
presenting the Approx-Dens algorithm.


Approx-Dens 


Approx-Dens computes a solution S of Top k-Cover-Densest Subgraphs. We assume
that $k \ge 3$. Indeed, for $k = 2$, we can compute two distinct subgraphs that cover
G and that have maximum density by applying the algorithm of Sect. 3, as
described in the following. If we obtain two distinct subgraphs, we know that
they are two subgraphs of maximum density that cover G. If the algorithm of
Sect. 3 returns two identical subgraphs $G_1$, $G_2$, then they must be identical to the
input graph G. Then we can compute in polynomial time a densest subgraph
of G distinct from G, by applying the modification of Goldberg's Algorithm
described in [7].

Hence assume $k \ge 3$. Approx-Dens first adds to S two densest distinct subgraphs
of G, denoted by $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. $G_1$ and $G_2$ are
computed in polynomial time as follows:

– $G_1 = (V_1, E_1)$ is computed by applying Goldberg's Algorithm [12] on G.
– $G_2$ is computed by applying the modification of Goldberg's Algorithm
  described in [7], with input G and $V_1$.


We assume that $|V_1| \ge |V_2|$ and $|V_2| \ge 2$. Notice that if this latter condition
does not hold, $|V_2| = 1$ and $\mathrm{dens}(G_2) = 0$, hence we can compute a solution
of Top k-Cover-Densest Subgraphs by computing any subgraph of G, since it
contains at least $k - 1$ subgraphs of density 0.

Starting with $S = \{G_1, G_2\}$, Approx-Dens computes the remaining subgraphs
in two phases (described later). The first phase is applied when there exists
at least one vertex of G that is not covered by S. In this phase, Approx-Dens
iteratively adds a subgraph $G_i$ to $S = \{G[V_1], \dots, G[V_{i-1}]\}$, for $3 \le i \le k - 1$.
In what follows, $C_{i-1} = \bigcup_{j=1}^{i-1} V_j$, that is $C_{i-1}$ is the set of vertices covered by
the subgraphs already in S. In the second phase, when $C_{i-1} = V$, that is the
input graph is already covered, Approx-Dens adds $k - i + 1$ subgraphs, which
are computed depending on the size of $G_1$. If Phase 2 is never executed, then
Approx-Dens adds subgraph $G_k = G$ to S.


Phase 1. Let $S = \{G[V_1], \dots, G[V_{i-1}]\}$. If there exists $u \in V - C_{i-1}$,
$G_i \leftarrow$ Densest$(G, u)$.

Phase 2. If $C_{i-1} = V$, then:

– If $|V_1| \le 2\log_2 |V|$, $G_i, \dots, G_k$ are $k - i + 1$ distinct subgraphs among
  Densest$(G, V_1 \cup \{u\}, S)$ and Densest$(G, V_2 \cup \{u\}, S)$, with $u \in V - V_1 - V_2$.
  Notice that the algorithm computes, and possibly adds to S, these subgraphs
  based on some ordering of the vertices in $V - V_1 - V_2$: it starts with the first
  vertex of $V - V_1 - V_2$, then the second one, and so on until S contains k
  subgraphs.
– If $2\log_2 |V| < |V_1| \le |V| - \log_2 |V|$, $G_i, \dots, G_k$ are $k - i + 1$ densest distinct
  subgraphs among $G[V_1 \cup U]$, with $U$ any subset of $V - V_1$ of size at most
  $\log_2 |V|$.
– If $|V_1| > |V| - \log_2 |V|$, $G_i, \dots, G_k$ are $k - i + 1$ densest distinct subgraphs among
  $G[V_1 - U]$, with $U$ a subset of $V_{MIN}$, where $V_{MIN} \subseteq V_1$ consists of the $\log_2 |V|$
  vertices of $V_1$ having smallest degree.


We recall that if Phase 2 is never executed, $G_k = G$. We start by showing
that Approx-Dens returns a feasible solution, that is a set of k subgraphs that
covers G.


Lemma 5. Approx-Dens returns a set of k subgraphs that cover G. 


Next we show that Approx-Dens achieves an approximation factor of 2/3 for
the density profit.


Lemma 6. Each subgraph $G_i$, with $3 \le i \le k - 1$, computed by Phase 1 of
Approx-Dens has density at least $\frac{2}{3}\mathrm{dens}(G_1)$.


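Proof. Consider the subgraph $G' = G[V_1 \cup \{u\}]$. By construction $\mathrm{dens}(G_i) \ge
\mathrm{dens}(G')$. Now, consider the density of $G'$; it holds that

$$\mathrm{dens}(G') \ge \frac{|E_1|}{|V_1| + 1} = \frac{|E_1|}{|V_1|} \cdot \frac{|V_1|}{|V_1| + 1}.$$

Since $|V_1| \ge 2$, $\frac{|V_1|}{|V_1| + 1} \ge \frac{2}{3}$, thus

$$\mathrm{dens}(G') \ge \frac{|E_1|}{|V_1|} \cdot \frac{|V_1|}{|V_1| + 1} \ge \frac{2}{3}\mathrm{dens}(G_1).$$

It follows that $\mathrm{dens}(G_i) \ge \mathrm{dens}(G') \ge \frac{2}{3}\mathrm{dens}(G_1)$, thus concluding the proof.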


Lemma 7. Each $G_i$, with $3 \le i \le k$, computed by Phase 2 of Approx-Dens has
density at least $\frac{2}{3}\mathrm{dens}(G_x)$, with $x \in \{1, 2\}$.


Proof. We consider the three cases of Phase 2.

Case 1. $|V_1| \le 2\log_2 |V|$. Since $|V_x| \ge 2$, with $x \in \{1, 2\}$, it follows that a
subgraph $G_i = (V_i, E_i)$ has the following density:

$$\frac{|E_i|}{|V_i|} \ge \frac{|E_x|}{|V_x| + 1} = \frac{|E_x|}{|V_x|} \cdot \frac{|V_x|}{|V_x| + 1} \ge \frac{2}{3} \frac{|E_x|}{|V_x|}.$$

Case 2. $2\log_2 |V| < |V_1| \le |V| - \log_2 |V|$. Consider a set $U_i$ of vertices added to
$V_1$. Since $|U_i| \le \log_2 |V|$ and $|V_1| > 2\log_2 |V|$, it holds that

$$\frac{|E_i|}{|V_i|} \ge \frac{|E_1|}{|V_1| + \log_2 |V|} = \frac{|E_1|}{|V_1|} \cdot \frac{|V_1|}{|V_1| + \log_2 |V|} \ge \frac{2}{3} \frac{|E_1|}{|V_1|}.$$

Case 3. $|V_1| > |V| - \log_2 |V|$. Consider a set $V_{MIN} \subseteq V_1$, with $|V_{MIN}| = \log_2 |V|$,
a set of vertices having smallest degree in $G_1$. Let $U_i$, $3 \le i \le k$, be a subset of
$V_{MIN}$ and $G_i = G[V_1 - U_i]$.

Let $d$ be the average degree of vertices in $G_1$, then by removing $U_i$ from $G_1$,
at most $d|U_i|$ edges are removed from $E_1$. Thus:

$$\frac{|E_i|}{|V_i|} \ge \frac{|E_1| - d|U_i|}{|V_1|} = \frac{|E_1|}{|V_1|} - \frac{d|U_i|}{|V_1|} \ge \frac{|E_1|}{|V_1|} - d\left(\frac{\log_2 |V|}{|V| - \log_2 |V|}\right).$$

For $|V|$ larger than a constant ($|V| \ge 37$; notice that if $|V|$ is a constant we can
solve the problem by brute force), since $\frac{|E_1|}{|V_1|} \ge \frac{1}{2}d$, it holds that

$$d\left(\frac{\log_2 |V|}{|V| - \log_2 |V|}\right) \le \frac{d}{6} \le \frac{1}{3} \frac{|E_1|}{|V_1|}.$$

It follows that

$$\frac{|E_i|}{|V_i|} \ge \frac{2}{3} \frac{|E_1|}{|V_1|},$$

thus concluding the proof.


Next, we show that Approx-Dens approximates the optimal density within a
factor of 2/3.


Lemma 8. The subgraphs $G_1, \dots, G_k$ computed by Approx-Dens have density at
least $\frac{2}{3}OPT(Dens)$.


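Proof. Consider an optimal solution consisting of subgraphs $G^*_1, \dots, G^*_k$. Since
$G_1$, $G_2$ are two densest subgraphs of $G$, it follows from Lemma 6 and Lemma 7
that for $i$ with $3 \le i \le k - 1$, $\mathrm{dens}(G_i) \ge \frac{2}{3}\mathrm{dens}(G^*_i)$.

Now, consider the subgraphs $G_1$, $G_2$, $G_k$. Since $G_1$, $G_2$ are two densest
subgraphs of $G$, it follows that

$$\mathrm{dens}(G_1) + \mathrm{dens}(G_2) \ge \frac{2}{3}\left(\mathrm{dens}(G^*_1) + \mathrm{dens}(G^*_2) + \mathrm{dens}(G^*_k)\right)$$

thus

$$\mathrm{dens}(G_1) + \mathrm{dens}(G_2) + \mathrm{dens}(G_k) \ge \frac{2}{3}\left(\mathrm{dens}(G^*_1) + \mathrm{dens}(G^*_2) + \mathrm{dens}(G^*_k)\right)$$

and the lemma holds.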

Dist


Dist starts with $z = |V|$ subgraphs, each one containing a vertex of $V$ and, while
$k < z$, merges any two of these subgraphs. Let $D_1, \dots, D_k$ be the subgraphs
returned by Dist.
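A minimal sketch of Dist follows. The source only says that "any two" subgraphs are merged; merging the last two parts of the list is an arbitrary choice made here for illustration, and the function names are assumptions.

```python
from itertools import combinations

def dist(vertices, k):
    """Start from singleton vertex sets and merge until exactly k parts remain."""
    parts = [{v} for v in vertices]
    while len(parts) > k:
        merged = parts.pop() | parts.pop()   # "any two": here, the last two parts
        parts.append(merged)
    return parts

def distance(a, b):
    """d(A, B) = 1 - |A ∩ B|^2 / (|A| |B|)."""
    return 1.0 - len(a & b) ** 2 / (len(a) * len(b))

parts = dist([1, 2, 3, 4, 5], k=3)
pairwise = [distance(a, b) for a, b in combinations(parts, 2)]
```

Because the parts stay pairwise disjoint, every pairwise distance equals 1, which is the largest value the distance function can take.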


Lemma 9. $D_1, \dots, D_k$ have maximum distance profit.


Proof. The lemma follows from the fact that $D_1, \dots, D_k$ are disjoint, hence each
pairwise distance equals 1, the maximum value of the distance function.


Now, we show that we achieve an approximation factor of 2/5 for Top k-Cover-Densest
Subgraphs. We denote by OPT the profit of an optimal solution $S^*$ of
Top k-Cover-Densest Subgraphs, by OPT(Dens) (by OPT(Dist), respectively)
the density profit (distance profit, respectively) of $S^*$.


Theorem 2. Let $S = \{G_1, \dots, G_k\}$ be the solution returned by Approx-Dens and
let $D = \{D_1, \dots, D_k\}$ be the solution returned by Dist. Then
$\max(p(S), p(D)) \ge \frac{2}{5}OPT$.


Proof. Recall that by Lemma 8, it holds that $p(S) \ge \frac{2}{3}OPT(Dens)$, and by
Lemma 9 it holds that $p(D) \ge \lambda OPT(Dist)$.

First, assume that $\lambda OPT(Dist) \ge \frac{2}{3}OPT(Dens)$. Then

$$p(D) \ge \lambda OPT(Dist) \ge \frac{2}{5}\lambda OPT(Dist) + \frac{3}{5}\lambda OPT(Dist) \ge \frac{2}{5}\lambda OPT(Dist) + \frac{2}{5}OPT(Dens),$$

thus in this case Dist returns a solution having approximation factor $\frac{2}{5}$.

Assume that $\lambda OPT(Dist) < \frac{2}{3}OPT(Dens)$. Then

$$p(S) \ge \frac{2}{3}OPT(Dens) \ge \frac{2}{5}OPT(Dens) + \frac{4}{15}OPT(Dens) \ge \frac{2}{5}OPT(Dens) + \frac{2}{5}\lambda OPT(Dist),$$

thus in this case Approx-Dens returns a solution having approximation factor $\frac{2}{5}$,
concluding the proof.


5 Conclusions and Open Problems 


We have considered two variants of the problem of covering a graph with k
densest subgraphs, where $k \ge 2$. For the first variant, we have shown that it
is solvable in polynomial time, for any $k \ge 2$. For the second variant, which is
NP-hard for $k \ge 3$, we have presented an approximation algorithm that achieves
a factor of 2/5.

There are some interesting open problems related to k-Densest Cover Subgraphs
and Top k-Cover-Densest Subgraphs. It would be interesting to study whether
the version of k-Densest Cover Subgraphs that asks for k densest distinct subgraphs
that cover G is polynomial time solvable. A positive answer would also help to
improve the approximation of Top k-Cover-Densest Subgraphs.

Abstract. We present an algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median
clustering of polygonal curves in $\mathbb{R}^d$ under the Fréchet distance.
This type of clustering is an adaptation of Euclidean k-median clustering:
we are given a set of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity (number
of vertices) at most $m$, and want to compute $k$ median curves such
that the sum of distances from the given curves to their closest median
curve is minimal. Additionally, we restrict the complexity of the median
curves to be at most $\ell$ each, to suppress overfitting, a problem specific for
sequential data. Our algorithm has running time linear in $n$, sub-quartic
in $m$ and quadratic in $\varepsilon^{-1}$. With high probability it returns $\varepsilon$-coresets of
size quadratic in $\varepsilon^{-1}$ and logarithmic in $n$ and $m$. We achieve this result
by applying the improved $\varepsilon$-coreset framework by Langberg and Feldman
to a generalized k-median problem over an arbitrary metric space. Later
we combine this result with the recent result by Driemel et al. on the VC
dimension of metric balls under the Fréchet distance. Furthermore, our
framework yields $\varepsilon$-coresets for any generalized k-median problem where
the range space induced by the open metric balls of the underlying space
has bounded VC dimension, which is of independent interest. Finally, we
show that our $\varepsilon$-coresets can be used to improve the running time of an
existing approximation algorithm for $(1, \ell)$-median clustering.


Keywords: Clustering · Coresets · Median · Polygonal curves


1 Introduction 


At the present time even efficient approximation algorithms are often incapable
of handling massive data sets, which have become common. Here, we need
efficient methods to reduce data while (approximately) maintaining the core
properties of the data. A popular approach to this topic is $\varepsilon$-coresets; see for example
[14,27] for comprehensive surveys. An $\varepsilon$-coreset is a small (weighted) set that
aggregates certain properties of a given (massive) data set up to some small
error. $\varepsilon$-coresets are very popular in the field of clustering, cf. [11,19,20,22] and
they are becoming a topic in other fields, too, cf. [16,28]. The technique for
computing an $\varepsilon$-coreset for a given data set highly depends on the application
at hand, but mostly $\varepsilon$-coresets are computed by filtering the given data set.

While $\varepsilon$-coresets can be computed efficiently for k-clustering of points in the
Euclidean space, less is known for clustering of curves. In particular, only little
effort (cf. [9]) has been made in designing methods for computing $\varepsilon$-coresets for
$(k, \ell)$-clustering of polygonal curves in $\mathbb{R}^d$ under the Fréchet distance. This type
of clustering, which has recently drawn increasing popularity due to a growing
number of applications, see [4,5,8,12], is an adaptation of the Euclidean
k-clustering: we are given a set of polygonal curves and seek to compute $k$ center
curves that minimize either the maximum Fréchet distance (center objective),
or the sum of Fréchet distances (median objective), among the given curves and
their closest center curve. In addition, we restrict the complexity (the number of
vertices) of each center curve to be at most $\ell$ to suppress overfitting, a problem
specific for sequential data. This means that input curves and center curves are
in general of different complexities, which is the reason why we need a specialized
algorithm for computing $\varepsilon$-coresets and cannot apply $\varepsilon$-coreset algorithms for
discrete metric spaces (cf. [15]) on the input.

The Fréchet distance is a natural dissimilarity measure for curves that is a
pseudo-metric and can be computed efficiently [1]. Unlike other measures for
curves, like the dynamic time warping distance (or the discrete version of the
Fréchet distance), it takes the whole course of the curves into account, not only
the pairwise distances among their vertices. This can be particularly useful,
e.g. when the input consists of irregularly sampled trajectories, cf. [12].
Unfortunately, since the Fréchet distance is a bottleneck distance measure, i.e., it
boils down to a single distance between two points on the curves, it is
sensitive to outliers, which may negatively affect its applications. In clustering, we
can counteract by choosing an appropriate clustering objective and indeed, the
$(k, \ell)$-median objective is a good choice, because the median is a robust
measure of central tendency. However, the state of the art $(k, \ell)$-median clustering
algorithms (cf. [8,12]) have exponential running time dependencies and cannot
be used in practice, while the practical algorithms for $(k, \ell)$-clustering (cf. [4,5])
rely on the $(k, \ell)$-center objective, which is not robust and therefore amplifies
the sensitivity to outliers.

In this work, we present an algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median
clustering under the Fréchet distance and improve a $(1, \ell)$-median
clustering algorithm by Buchin et al. [7], using $\varepsilon$-coresets and rendering it much
more practical.


1.1 Related Work 


$(k, \ell)$-clustering of polygonal curves was introduced by Driemel et al. [12]. They
developed the first approximation schemes for $(k, \ell)$-center and $(k, \ell)$-median
clustering of polygonal curves in $\mathbb{R}$, which run in near-linear time. They proved
that both problems are NP-hard, when $k$ is part of the input. Further, they
showed that the doubling dimension of the space of polygonal curves under the
Fréchet distance is unbounded, even when the curves are of bounded complexity.
Subsequently, Buchin et al. [4] presented a constant factor approximation
algorithm for $(k, \ell)$-center clustering of polygonal curves in $\mathbb{R}^d$, with running
time linear in the number of given curves and polynomial in their maximum
complexity. Also they showed that $(k, \ell)$-center clustering is NP-hard and NP-hard
to approximate within a factor of $(1.5 - \varepsilon)$ for curves in $\mathbb{R}$, respectively
$(2.25 - \varepsilon)$ for curves in $\mathbb{R}^d$ with $d \ge 2$, even for $k = 1$. Buchin et al. [5]
provided practical algorithms for $(k, \ell)$-clustering under the Fréchet distance and
thereby introduced a new technique, the so-called Fréchet centering, for
computing better cluster centers. Also, Meintrup et al. [26] provided a practical
$(1 + \varepsilon)$-approximation algorithm for discrete $(k, \ell)$-median clustering under the
presence of a certain number of outliers. Buchin et al. [6] proved that
$(k, \ell)$-median clustering is also NP-hard, even for $k = 1$. Furthermore, they presented
polynomial-time approximation schemes for $(k, \ell)$-center and $(k, \ell)$-median
clustering of polygonal curves under the discrete Fréchet distance. Nath and Taylor
[29] gave a near-linear time approximation scheme for $(k, \ell)$-median clustering
of polygonal curves in $\mathbb{R}^d$ under the discrete Fréchet distance and a
polynomial-time approximation scheme for k-median clustering of sets of points from $\mathbb{R}^d$
under the Hausdorff distance. Furthermore, they showed that k-median
clustering of point sets under the Hausdorff distance is NP-hard (for constant $k$).
Recently, Buchin et al. [8] developed an approximation scheme for $(k, \ell)$-median
clustering under the Fréchet distance with running time linear in the number
of curves and polynomial in their complexity, where the computed centers have
complexity up to $2\ell - 2$.

Langberg and Schulman [25] developed a framework for computing relative
error approximations of integrals over any function from a given family
of unbounded and non-negative real functions. In particular, this framework can
be used to compute $\varepsilon$-coresets for k-clustering of points in $\mathbb{R}^d$ with objective
functions based on sums of distances among the points and their closest center.
The idea of their framework is to sub-sample the input with respect to a certain
non-uniform probability distribution, which is computed using an approximate
solution to the problem. More precisely, the approximate solution is used to
compute an upper bound on the sensitivity of each data element. The sensitivity
is the maximum fraction of cost that the element may cause for any possible
solution. It is a notion of the data element's importance for the problem and the
probability distribution is set up such that each element has probability
proportional to its importance. A sample of a certain size drawn from this distribution
and properly weighted is an $\varepsilon$-coreset for the underlying clustering problem with
high probability. Feldman and Langberg [15] developed a unified framework for
approximate clustering, which is largely based on $\varepsilon$-coresets. They combine the
techniques by Langberg and Schulman [25] with $\varepsilon$-approximations, which stem
from the framework of range spaces and VC dimension developed in statistical
learning theory. As a result, they address a spectrum of clustering problems, such
as k-median clustering, k-line median clustering, projective clustering and also
other problems like subspace approximation. Braverman et al. [2] improved the
aforementioned framework by switching to $(\varepsilon, \eta)$-approximations, which leads to
substantially smaller sample sizes in many cases. Also, they simplified and further
generalized the framework and applied it to k-means clustering of points in $\mathbb{R}^d$.
Subsequently, Feldman et al. [16] improved this framework by switching to another
range space, thereby obtaining smaller coresets for k-means, k-line means and
affine subspace clustering.


1.2 Our Contributions 


In this work we develop an algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median
clustering of polygonal curves in $\mathbb{R}^d$ under the Fréchet distance (where in the
following we assume $k$, $\ell$ and $d$ to be constant):


Theorem 1. There exists an algorithm that, given a set of $n$ polygonal curves
in $\mathbb{R}^d$ of complexity at most $m$ each and a parameter $\varepsilon \in (0, 1)$, returns with
constant positive probability an $\varepsilon$-coreset for $(k, \ell)$-median clustering under the
Fréchet distance of size $O(k^2 \log^2(k)\, \varepsilon^{-2} \log(m) \log(kn))$, in time

$$O(nm \log(m) + nm^3 \log(m) + \varepsilon^{-2} \log(m) \log(n))$$

for $k > 1$ and

$$O(nm \log(m) + m^2 \log(m) + \varepsilon^{-2} \log(m) \log(n))$$

for $k = 1$.


Also we show that $\varepsilon$-coresets can be used to improve the running time of an
existing $(1, \ell)$-median $(5 + \varepsilon)$-approximation algorithm [7], thereby facilitating
its application in practice.

We start by defining generalized k-median clustering, where input and centers
come from a subset (not necessarily the same) of an underlying metric space,
each, and then derive our $\varepsilon$-coreset result in this setting. This notion captures
$(k, \ell)$-median clustering under the Fréchet distance in particular, but the analysis
holds for any metric space. In doing so, we first give a universal bound on the
so-called sensitivity of the elements of the given data set and their total sensitivity,
i.e., the sum of their sensitivities. The sensitivities are a measure of the data
elements' importance, i.e., the maximum fraction of the cost an element might
cause for any center set, and later they determine the sample probabilities. Our
analysis is based on the analysis of Langberg and Schulman [25].

Next, we apply the improved $\varepsilon$-coreset framework by Feldman and Langberg
[15]. Here, our analysis is based on the analysis of Feldman et al. [16], but our 
sample size depends on the VC-dimension of the range space induced by the open 
metric balls. The open metric balls form a basis of the metric topology, hence it 
is more natural to study the VC dimension of their associated range space in a 
geometric setting. Indeed, for the ae spaces these range spaces have already been 
studied [17, Theorem 2.2] and recently, results for the (continuous and discrete) 
Fréchet, weak Fréchet and Hausdorff distance were obtained [13], enabling our 
main result. Finally, we show how an existing $(1, \ell)$-median $(5 + \varepsilon)$-approximation
algorithm [7] can be improved by means of our $\varepsilon$-coresets.

Theorem 2. There exists an algorithm that, given a set $T$ of $n$ polygonal curves
of complexity at most $m$ each, and a parameter $\varepsilon \in (0, 1/2]$, computes a polygonal
curve $c$ of complexity $2\ell - 2$, such that with constant positive probability, it holds
that

$$\mathrm{cost}(T, \{c\}) = \sum_{\tau \in T} d_F(\tau, c) \le (5 + \varepsilon) \sum_{\tau \in T} d_F(\tau, c^*) = (5 + \varepsilon)\,\mathrm{cost}(T, \{c^*\}),$$

where $c^*$ is an optimal $(1, \ell)$-median for $T$ under the Fréchet distance. The
algorithm has running time


O (nmlog(m) + m? log(m) + m~1e~2/4+24~2 Jog? (m) log(n)) . 


Theorems 1 and 2 will follow from Theorems 7 and 9, respectively. Note that
although we do not present algorithms for computing $\varepsilon$-coresets for the weak
and discrete Fréchet and Hausdorff distance, our results also imply the existence
of $\varepsilon$-coresets of similar size for these metrics.


1.3 Organization


In Sect. 2 we give the results for general metric spaces: we derive a universal
bound on the sensitivities in Sect. 2.1 and the $\varepsilon$-coreset result in Sect. 2.2. In
Sect. 3 we present the algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median
clustering. Finally, in Sect. 4 we demonstrate the use of $\varepsilon$-coresets in an existing
$(5 + \varepsilon)$-approximation algorithm for $(1, \ell)$-median clustering. All proofs can be
found in the full paper [10].


2 Coresets for Generalized k-Median Clustering 
in Metric Spaces 


In this section, we first derive general results for $\varepsilon$-coresets based on the sensitivity
sampling framework [15,25]. In the following $d \in \mathbb{N}$ is an arbitrary constant. By
$\|\cdot\|$ we denote the Euclidean norm and for $n \in \mathbb{N}$, we define $[n] = \{1, \dots, n\}$. For
a closed logical formula $\Psi$ we define by $\mathbb{1}(\Psi)$ the function that is 1 if $\Psi$ is true
and 0 otherwise.

Let $\mathcal{X} = (X, \rho)$ be an arbitrary metric space, where $X$ is any non-empty set
and $\rho : X \times X \to \mathbb{R}_{\ge 0}$ is a distance function. We introduce a generalized definition
of k-median clustering, where the input is restricted to come from a predefined
subset $Y \subseteq X$ and the medians are restricted to come from a predefined subset
$Z \subseteq X$.


Definition 1. The generalized k-median clustering problem is defined as follows,
where $k \in \mathbb{N}$ is a fixed (constant) parameter of the problem: given a finite
and non-empty set $T = \{\tau_1, \dots, \tau_n\} \subseteq Y$, compute a set $C$ of $k$ elements from
$Z$, such that $\mathrm{cost}(T, C) = \sum_{\tau \in T} \min_{c \in C} \rho(\tau, c)$ is minimal.
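The cost function of Definition 1 can be written down directly for any distance function. The sketch below also includes an exhaustive search over a finite candidate set; the finiteness of Z, the function names and the example on the real line are assumptions made for illustration.

```python
from itertools import combinations

def cost(inputs, centers, rho):
    """cost(T, C): each input pays the distance to its closest center."""
    return sum(min(rho(tau, c) for c in centers) for tau in inputs)

def brute_force_k_median(inputs, candidates, k, rho):
    """Exhaustively pick the k-subset of a finite candidate set minimizing the cost."""
    return min(combinations(candidates, k), key=lambda C: cost(inputs, C, rho))

# Toy instance (assumption): points on the real line with the absolute difference.
rho = lambda a, b: abs(a - b)
T = [0, 1, 10, 11]          # inputs, a subset of Y
Z = [0, 1, 5, 10, 11]       # candidate medians
best = brute_force_k_median(T, Z, k=2, rho=rho)
best_cost = cost(T, best, rho)
```

Several center pairs achieve the minimum cost 2 on this instance; `min` returns the first one in the enumeration order of `combinations`.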

We analyze the problem in terms of functions. This allows us to apply the
improved $\varepsilon$-coreset framework by Feldman and Langberg [15]. Therefore, given
a set $T = \{\tau_1, \dots, \tau_n\} \subseteq Y$ we define $F = \{f_1, \dots, f_n\}$ to be a set of functions
with $f_i : 2^Z \setminus \{\emptyset\} \to \mathbb{R}_{\ge 0}$, $C \mapsto \min_{c \in C} \rho(c, \tau_i)$. For each $C \in 2^Z \setminus \{\emptyset\}$ we now
have $\mathrm{cost}(T, C) = \sum_{i=1}^{n} f_i(C)$.

In the following, we bound the sensitivity of each $\tau \in T$. That is the maximum
fraction of $\mathrm{cost}(T, C)$ that is caused by $\tau$, for all $C$. To comply with the k-median
problem we only take into account the k-subsets $C \subseteq Z$.


2.1 Sensitivity Bound 


First, we formally define the sensitivities of the inputs $\tau \in T$ in terms of the
respective functions.


Definition 2 ([15]). Let $F$ be a finite and non-empty set of functions $f : 2^Z \setminus
\{\emptyset\} \to \mathbb{R}_{\ge 0}$. For $f \in F$ we define the sensitivity with respect to $F$:

$$\mathfrak{s}(f, F) = \sup_{\substack{C = \{c_1, \dots, c_k\} \subseteq Z \\ \sum_{g \in F} g(C) > 0}} \frac{f(C)}{\sum_{g \in F} g(C)}.$$

We define the total sensitivity of $F$ as $\mathfrak{S}(F) = \sum_{f \in F} \mathfrak{s}(f, F)$.
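For a finite candidate set the supremum in Definition 2 is a maximum and can be evaluated by enumeration. The following sketch does exactly that; the restriction to a finite Z and the example on the real line are assumptions made for illustration, and the routine is exponential in k.

```python
from itertools import combinations

def sensitivities(inputs, candidates, k, rho):
    """Sensitivity of each input: max over k-subsets C of f_i(C) / sum_g g(C)."""
    sens = [0.0] * len(inputs)
    for C in combinations(candidates, k):
        f = [min(rho(tau, c) for c in C) for tau in inputs]
        total = sum(f)
        if total > 0:   # center sets with zero total cost are excluded
            sens = [max(s, fi / total) for s, fi in zip(sens, f)]
    return sens

# Toy instance (assumption): three points on the real line, one center.
rho = lambda a, b: abs(a - b)
T = [0, 1, 10]
s = sensitivities(T, candidates=T, k=1, rho=rho)
total_sensitivity = sum(s)
```

The outlier 10 has by far the largest sensitivity (10/11, attained when the center is placed at 0), which reflects that it can dominate the cost of a solution.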


We now prove a bound on the sensitivity of all $f \in F$, which then yields a
bound on the total sensitivity of $F$. Later, our coreset will be a weighted sample
from a distribution whose probabilities are determined by the derived bounds.
To compute the bounds, any (bi-criteria) approximate solution to the generalized
k-median problem can be used. Our analysis is an adaptation of the analysis of
the sensitivities for sum-based k-clustering of points in $\mathbb{R}^d$ by Langberg and
Schulman [25]. We note that similar bounds have already been derived in the
literature, see e.g., [30].
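The sampling step described above can be sketched generically: draw elements with probability proportional to an upper bound on their sensitivity and weight each draw by the inverse of its expected number of occurrences. This is the standard importance-sampling scheme, written here under the assumption that the bounds are already given; the function name and the example bounds are hypothetical, and the sample size required for the coreset guarantee is not modeled.

```python
import random

def sensitivity_sample(inputs, bounds, m, rng):
    """Draw m elements with probability proportional to `bounds`.

    Each sampled element gets weight 1 / (m * p), where p is its sampling
    probability, so the weighted cost is an unbiased estimate of the true cost.
    """
    total = sum(bounds)
    probs = [b / total for b in bounds]
    indices = rng.choices(range(len(inputs)), weights=probs, k=m)
    return [(inputs[i], 1.0 / (m * probs[i])) for i in indices]

# Toy instance (assumption): uniform bounds, so every weight equals n / m.
rng = random.Random(0)
sample = sensitivity_sample([0, 1, 10, 11], bounds=[1, 1, 1, 1], m=8, rng=rng)
```

With non-uniform bounds, elements with a large sensitivity bound are sampled more often but receive proportionally smaller weights.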


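Lemma 1. Let $k' \in \mathbb{N}$, $C^* = \{c^*_1, \dots, c^*_k\} \subseteq Z$ with $\Delta^* = \sum_{i=1}^{n} f_i(C^*)$ minimal
and $\hat{C} = \{\hat{c}_1, \dots, \hat{c}_{k'}\} \subseteq X$ with $\hat{\Delta} = \sum_{i=1}^{n} f_i(\hat{C}) \le \alpha \cdot \Delta^*$ for an $\alpha \in [1, \infty)$.
Breaking ties arbitrarily, we assume that every $\tau \in T$ has a unique nearest
neighbor in $\hat{C}$ and for $i \in [k']$, we define $\hat{V}_i = \{\tau \in T \mid \forall j \in [k'] : \rho(\tau, \hat{c}_i) \le
\rho(\tau, \hat{c}_j)\}$ to be the Voronoi cell of $\hat{c}_i$ and $\hat{\Delta}_i = \sum_{\tau \in \hat{V}_i} \rho(\tau, \hat{c}_i)$ to be its cost. For
each $i \in [k']$ and $\tau_j \in \hat{V}_i$ it holds that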


\_ (12, (2H) (oe), 204i) f, | ‘ES 2 . 
i= (13 ze) ( A a (13 52 1A > s(fj, F) 


and P= yep (f) = 2k + 2V6ak! + 3a > G(F). 
2.2 Coresets by Sensitivity Sampling


We apply the framework of Feldman and Langberg [15]. First, we formally define
$\varepsilon$-coresets for generalized $k$-median clustering.


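Definition 3. Given $\varepsilon \in (0,1)$ and a finite non-empty set $T \subseteq Y$, a (multi-)set
$S \subseteq X$ together with a weight function $w \colon S \to \mathbb{R}_{>0}$ is a weighted $\varepsilon$-coreset for
$k$-median clustering of $T$, if for all $C \subseteq Z$ with $|C| = k$ it holds that

$$(1 - \varepsilon)\,\mathrm{cost}(T, C) \leq \mathrm{cost}_w(S, C) \leq (1 + \varepsilon)\,\mathrm{cost}(T, C),$$

where $\mathrm{cost}_w(S, C) = \sum_{s \in S} w(s) \cdot \min_{c \in C} \rho(s, c)$.
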
We define range spaces and the associated concepts. 


Definition 4. A range space is a pair $(X, \mathcal{R})$, where $X$ is a set, called ground
set, and $\mathcal{R}$ is a set of subsets $R \subseteq X$, called ranges.


The projection of a range space $(X, \mathcal{R})$ onto a subset $Y \subseteq X$ is the range
space $(Y, \{Y \cap R \mid R \in \mathcal{R}\})$. Furthermore, for each range space there exists a
complementary range space.


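Definition 5. Let $F = (X, \mathcal{R})$ be a range space. We call $\overline{F} = (X, \overline{\mathcal{R}})$, the range
space over $\overline{\mathcal{R}} = \{X \setminus R \mid R \in \mathcal{R}\}$, the complementary range space of $F$.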


A measure of the combinatorial complexity of a range space is the VC dimen- 
sion. 


Definition 6. The VC dimension of a range space $(X, \mathcal{R})$ is the cardinality of
a maximum cardinality subset $Y \subseteq X$, such that $|\{Y \cap R \mid R \in \mathcal{R}\}| = 2^{|Y|}$.


Note that $F$ and $\overline{F}$ have equal VC dimension and for any $Y \subseteq X$, the
projection of $F$ onto $Y$ has VC dimension at most the VC dimension of $F$, see
for example [18]. We define $(\eta, \varepsilon)$-approximations of range spaces.


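Definition 7 ([21, Definition 2.3]). Let $\varepsilon, \eta \in (0,1)$ and $(X, \mathcal{R})$ be a range
space with finite non-empty ground set. An $(\eta, \varepsilon)$-approximation of $(X, \mathcal{R})$ is a
set $S \subseteq X$, such that for all $R \in \mathcal{R}$

$$\left| \frac{|R \cap X|}{|X|} - \frac{|R \cap S|}{|S|} \right| \leq \begin{cases} \varepsilon \cdot \frac{|R \cap X|}{|X|}, & \text{if } |R \cap X| \geq \eta \cdot |X| \\ \varepsilon \cdot \eta, & \text{else.} \end{cases}$$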


The following theorem is useful for obtaining $(\eta, \varepsilon)$-approximations.


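Theorem 3 ([21, Theorem 2.11]). Let $(X, \mathcal{R})$ be a range space with finite
non-empty ground set and VC dimension $D$. Also, let $\varepsilon, \delta, \eta \in (0,1)$. There is
an absolute constant $c \in \mathbb{R}_{>0}$ such that a sample of

$$\frac{c}{\eta \varepsilon^2} \left( D \log\frac{1}{\eta} + \log\frac{1}{\delta} \right)$$

elements drawn independently and uniformly at random with replacement from
$X$ is an $(\eta, \varepsilon)$-approximation for $(X, \mathcal{R})$ with probability at least $1 - \delta$.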
We define open metric balls, which are the ranges used to derive our result.


Definition 8. For $r \in \mathbb{R}_{\geq 0}$, $z \in Z$ and $Y \subseteq X$ we denote by $B(z, r, Y) = \{y \in Y \mid \rho(y, z) < r\}$
the open metric ball with center $z$ and radius $r$. We denote the
set of all open metric balls by $\mathbb{B}(Y, Z) = \{B(z, r, Y) \mid z \in Z, r \in \mathbb{R}_{\geq 0}\}$.


Now, we are ready to analyze the computation of the actual $\varepsilon$-coresets. We
use the reduction to uniform sampling, introduced by Feldman and Langberg [15]
and improved by Braverman et al. [2] (using Theorem 3). Preferably we would
apply Theorem 31 by Feldman et al. [16]. This is not possible, however, since it
depends on a range space where each function $f \in F$ may be assigned a distinct
scaling factor, which is incompatible with the range space induced by the open
metric balls we use to obtain our result. By adapting and modifying
the proof of their theorem we can nevertheless derive the desired and more versatile result.
To handle the scaling factors still involved in the analysis, we incorporate
results by Munteanu et al. [28] for bounding the VC dimension. The proof can
be found in the full paper [10].


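Theorem 4. For $f \in F$ we let $\lambda(f) = \lceil |F| \cdot 2^{\lceil \log_2(\gamma(f)) \rceil} \rceil / |F|$, $\Lambda = \sum_{f \in F} \lambda(f)$,
$\psi(f) = \frac{\lambda(f)}{\Lambda}$ and $D$ be the VC dimension of the range space $(Y, \mathbb{B}(Y, Z))$. Let
$\delta, \varepsilon \in (0,1)$. A sample $S$ of $O(\varepsilon^{-2} \alpha k' (D k \log(k) \log(\alpha k' n) \log(\alpha k') + \log(1/\delta)))$
elements $\tau_i$ from $T$, drawn independently with replacement with probability $\psi(f_i)$
and weighted by $w(f_i) = \frac{\Lambda}{|S| \lambda(f_i)}$, is an $\varepsilon$-coreset with probability at least $1 - \delta$.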
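
The sampling step of Theorem 4 can be sketched as follows. This is a minimal illustration, assuming the bounds $\gamma(f_i)$ are already available as positive numbers; the function names `sampling_distribution` and `sample_coreset` are hypothetical, and the sample size is passed in by the caller rather than derived from the bound in the theorem.

```python
import math
import random

def sampling_distribution(gammas):
    """Return (lambda values, Lambda, psi) as defined in Theorem 4."""
    n = len(gammas)
    # Round each bound up to a power of two, then up to a multiple of 1/n.
    lam = [math.ceil(n * 2 ** math.ceil(math.log2(g))) / n for g in gammas]
    Lam = sum(lam)
    psi = [value / Lam for value in lam]
    return lam, Lam, psi

def sample_coreset(gammas, size, rng):
    """Draw `size` indices i.i.d. according to psi and attach their weights."""
    lam, Lam, psi = sampling_distribution(gammas)
    idx = rng.choices(range(len(gammas)), weights=psi, k=size)
    weights = [Lam / (size * lam[i]) for i in idx]
    return idx, weights
```

An index that is drawn several times appears several times in the output, matching the (multi-)set in Definition 3. Passing a seeded `random.Random` instance keeps runs reproducible.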


3 Coresets for $(k, \ell)$-Median Clustering Under
the Fréchet Distance


Now we present an algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median clustering
of polygonal curves under the Fréchet distance. We start by defining polygonal
curves.


Definition 9. A (parameterized) curve is a continuous mapping $\tau \colon [0,1] \to \mathbb{R}^d$.
A curve $\tau$ is polygonal, iff there exist $v_1, \dots, v_m \in \mathbb{R}^d$, no three consecutive on
a line, called $\tau$'s vertices, and $t_1, \dots, t_m \in [0,1]$ with $t_1 < \dots < t_m$, $t_1 = 0$ and
$t_m = 1$, called $\tau$'s instants, such that $\tau$ connects every two consecutive vertices
$v_i = \tau(t_i)$, $v_{i+1} = \tau(t_{i+1})$ by a line segment.


We call the segments $\overline{v_1 v_2}, \dots, \overline{v_{m-1} v_m}$ edges of $\tau$ and $m$ the complexity of $\tau$,
denoted by $|\tau|$.


Definition 10. Let $\mathcal{H}$ denote the set of all continuous bijections $h \colon [0,1] \to [0,1]$
with $h(0) = 0$ and $h(1) = 1$, which we call reparameterizations. The Fréchet
distance between curves $\sigma$ and $\tau$ is $d_F(\sigma, \tau) = \inf_{h \in \mathcal{H}} \max_{t \in [0,1]} \|\sigma(t) - \tau(h(t))\|$.
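
For experimentation, a simple stand-in for Definition 10 is the discrete Fréchet distance, which only couples vertices and is an upper bound on the continuous Fréchet distance $d_F$ of the corresponding polygonal curves. The sketch below is this swapped-in variant, not the continuous distance used in the analysis; the function name `discrete_frechet` is an assumption.

```python
import math

def discrete_frechet(sigma, tau):
    """Discrete Fréchet distance between two vertex sequences (tuples in R^d)."""
    n, m = len(sigma), len(tau)
    # dp[i][j]: best achievable maximum distance for prefixes ending at (i, j).
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(sigma[i], tau[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]
```

The table has $|\sigma| \cdot |\tau|$ entries, so the running time is quadratic in the curve complexities. For coarsely sampled curves the discrete value can be noticeably larger than $d_F$.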


Now we introduce the classes of curves we are interested in. 
Definition 11. For $d \in \mathbb{N}$, we define by $\mathbb{X}^d$ the set of equivalence classes of
polygonal curves (where two curves are equivalent, iff they can be made identical
by a reparameterization) in ambient space $\mathbb{R}^d$. For $m \in \mathbb{N}$ we define by $\mathbb{X}^d_m$ the
subclass of polygonal curves of complexity at most $m$.


Finally, we define the $(k, \ell)$-median clustering problem for polygonal curves.


Definition 12. The $(k, \ell)$-median clustering problem is defined as follows,
where $k, \ell \in \mathbb{N}$ are fixed (constant) parameters of the problem: given a set
$T \subseteq \mathbb{X}^d_m$ of $n$ polygonal curves, compute a set of $k$ curves $C^* \subseteq \mathbb{X}^d_\ell$, such that
$\mathrm{cost}(T, C^*) = \sum_{\tau \in T} \min_{c^* \in C^*} d_F(\tau, c^*)$ is minimal.

We bound the VC dimension of metric balls under the Fréchet distance by 
showing that a result of Driemel et al. [13] holds also in our setting. 


Theorem 5. The VC dimension of $(\mathbb{X}^d_m, \mathbb{B}(\mathbb{X}^d_m, \mathbb{X}^d_\ell))$ is $O(d^2 \ell^2 \log(d \ell m))$.


Proof. We argue that the claim follows from Theorem 18 by Driemel et al. [13].
First, in their paper polygonal curves do not need to adhere to the restriction
that no three consecutive vertices may be collinear and they define $\mathbb{X}^d_m$ to be
the polygonal curves of exactly $m$ vertices. However, our definitions match by
simulating the addition of collinear vertices to those curves in $\mathbb{X}^d_m$ with fewer than
$m$ vertices.

Now, looking into their proof, we can slightly modify the geometric primitives
by letting $B_r(p) = \{x \in \mathbb{R}^d \mid \|x - p\| < r\}$, $D_r(\overline{st}) = \{x \in \mathbb{R}^d \mid \exists p \in \overline{st} \colon \|p - x\| < r\}$,
$C_r(\overline{st}) = \{x \in \mathbb{R}^d \mid \exists p \in \ell(\overline{st}) \colon \|p - x\| < r\}$ and $R_r(\overline{st}) = \{p + u \mid p \in \overline{st}, u \in \mathbb{R}^d, \langle t - s, u \rangle = 0, \|u\| < r\}$,
which does not affect the remainder of the
proof and thus yields the same bound on the VC dimension.


To compute $\varepsilon$-coresets for $(k, \ell)$-median clustering under the Fréchet distance,
we first need to compute the sensitivities and to do so, we utilize constant factor
approximation algorithms. We use [8, Algorithm 1], which only works for $k = 1$
but is very efficient in this case. For $k > 1$ we use Algorithm 1, a modification
of [12, Algorithm 3], which we now present. This algorithm uses (approximate)
minimum-error $\ell$-simplifications, which we now define.


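Definition 13. An $\alpha$-approximate minimum-error $\ell$-simplification of a polygonal
curve $\tau \in \mathbb{X}^d$ is a curve $\sigma \in \mathbb{X}^d_\ell$ with $d_F(\tau, \sigma) \leq \alpha \cdot d_F(\tau, \sigma')$ for all $\sigma' \in \mathbb{X}^d_\ell$.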


The following lemma is useful to obtain simplifications. 


Lemma 2 ([4, Lemma 7.1]). Given a curve $\sigma \in \mathbb{X}^d_m$, a 4-approximate
minimum-error $\ell$-simplification can be computed in $O(m^3 \log m)$ time, by combining
the algorithms by Alt and Godau [1] and Imai and Iri [23].


We now present the constant factor approximation algorithm. 
Algorithm 1. Constant Factor Approximation for $(k, \ell)$-Median Clustering
procedure $(k, \ell)$-Median-96-Approximation($T = \{\tau_1, \dots, \tau_n\}$)
    for $i = 1, \dots, n$ do
        $\hat{\tau}_i \leftarrow$ approximate minimum-error $\ell$-simplification of $\tau_i$
    $C \leftarrow$ Chen's algorithm with $\varepsilon = 0.5$, $\lambda = \delta$ on $\{\hat{\tau}_1, \dots, \hat{\tau}_n\}$ [11, Theorem 6.2]
    return $C$


We prove the correctness and analyze the running time of Algorithm 1. 


Theorem 6. Given $\delta \in (0,1)$ and $T = \{\tau_1, \dots, \tau_n\} \subseteq \mathbb{X}^d_m$, Algorithm 1 returns
with probability at least $1 - \delta$ a 109-approximate $(k, \ell)$-median solution for $T$ in
time $O(nm \log(1/\delta) \log(m) + nm^3 \log(m))$.


The proof can be found in the full paper [10]. We now present the algorithm
for computing weighted $\varepsilon$-coresets for $(k, \ell)$-median clustering.


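Algorithm 2. Coresets for $(k, \ell)$-Median Clustering
procedure $(k, \ell)$-Median-Coreset($T = \{\tau_1, \dots, \tau_n\}$, $\delta$, $\varepsilon$)
    if $k = 1$ then
        $\hat{c} \leftarrow (1, \ell)$-Median-34-Approximation($T$, $\delta/2$) [8, Algorithm 1]
        $\hat{C} = \{\hat{c}\}$
    else
        $\hat{C} = \{\hat{c}_1, \dots, \hat{c}_k\} \leftarrow$ Algorithm 1($T$, $\delta/2$)
    compute $\hat{V}_1, \dots, \hat{V}_k$, $\hat{\Delta}_1, \dots, \hat{\Delta}_k$ and $\gamma$ w.r.t. $\hat{C}$ (cf. Lemma 1)
    compute $\lambda$, $\Lambda$ w.r.t. $\gamma$ and $\psi$ w.r.t. $\lambda$ (cf. Theorem 4)
    $S \leftarrow$ sample $O(k \varepsilon^{-2} (d^2 \ell^2 k \log(d \ell m) \log(kn) \log^2(k) + \log(1/(2\delta))))$
        elements from $T$ independently with replacement with respect to $\psi$
    compute $w$ w.r.t. $\lambda$, $\Lambda$ and $S$ (cf. Theorem 4)
    return $S$ and $w$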


We prove the correctness and analyze the running time of Algorithm 2. Also,
we analyze the size of the resulting $\varepsilon$-coreset.


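Theorem 7. Given a set $T = \{\tau_1, \dots, \tau_n\} \subseteq \mathbb{X}^d_m$ and $\delta, \varepsilon \in (0,1)$, Algorithm 2
computes a weighted $\varepsilon$-coreset of size $O(\varepsilon^{-2} (\log(m) \log(n) + \log(1/\delta)))$ for $(k, \ell)$-median
clustering with probability at least $1 - \delta$, in time

$$O(nm \log(m) \log(1/\delta) + nm^3 \log(m) + \varepsilon^{-2} (\log(m) \log(n) + \log(1/\delta)))$$

for $k > 1$ and

$$O(nm \log(m) + m^2 \log(m) \log^2(1/\delta) + m^3 \log(m) + \varepsilon^{-2} (\log(m) \log(n) + \log(1/\delta)))$$

for $k = 1$.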


We prove this theorem (in the full paper [10]) by combining Lemma 1, 
Theorem 4, and Theorem 6, respectively [8, Corollary 3.1]. 
4 Towards Practical $(1, \ell)$-Median Approximation
Algorithms


In this section, we present a modification of Algorithm 3 from [7]. Our modification
uses $\varepsilon$-coresets to improve the running time of the algorithm, rendering
it more tractable in a big data setting. We start by giving some definitions. For
$p \in \mathbb{R}^d$ and $r \in \mathbb{R}_{>0}$ we denote by $B(p, r) = \{q \in \mathbb{R}^d \mid \|p - q\| \leq r\}$ the closed
Euclidean ball of radius $r$ with center $p$. We give a standard definition of grids.


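Definition 14 (grid). Given a number $r \in \mathbb{R}_{>0}$, for $p = (p_1, \dots, p_d) \in \mathbb{R}^d$ we
define by $G(p, r) = (\lfloor p_1/r \rfloor \cdot r, \dots, \lfloor p_d/r \rfloor \cdot r)$ the $r$-grid-point of $p$. Let $P \subseteq \mathbb{R}^d$
be a subset of $\mathbb{R}^d$. The grid of cell width $r$ that covers $P$ is the set $G(P, r) = \{G(p, r) \mid p \in P\}$.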
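
Definition 14 translates almost literally into code. The sketch below represents points as tuples of floats; the function names `grid_point` and `grid` are hypothetical. Each point is moved by less than $r$ in every coordinate, so its distance to its grid point is at most $\sqrt{d}\, r$.

```python
import math

def grid_point(p, r):
    """r-grid-point of p: round every coordinate down to a multiple of r."""
    return tuple(math.floor(x / r) * r for x in p)

def grid(points, r):
    """Grid of cell width r that covers the given point set."""
    return {grid_point(p, r) for p in points}
```

Because the result is a set, all points falling into the same cell are represented by a single grid point, which is what keeps the candidate set small when a ball is discretized.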


Such a grid partitions the set $P$ into cubic regions and for each $r \in \mathbb{R}_{>0}$ and
$p \in P$ we have that $\|p - G(p, r)\| \leq \sqrt{d}\, r$. The following theorem by Indyk [24]
is useful for evaluating the cost of a curve at hand.


Theorem 8 ([24, Theorem 31]). Let $\varepsilon \in (0,1]$ and $T \subseteq \mathbb{X}^d$ be a set of polygonal
curves. Further let $W$ be a non-empty sample, drawn uniformly and independently
at random from $T$, with replacement. For $\tau, \sigma \in T$ with $\mathrm{cost}(T, \tau) > (1 + \varepsilon)\,\mathrm{cost}(T, \sigma)$
it holds that $\Pr[\mathrm{cost}(W, \tau) \leq \mathrm{cost}(W, \sigma)] < \exp(-\varepsilon^2 |W| / 64)$.


The following lemma, which we combine with fine-tuned grids, allows us to
obtain low-complexity center curves.


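Lemma 3 ([7, Lemma 4.1]). Let $\sigma, \tau \in \mathbb{X}^d$ be polygonal curves. Let
$v_1^\tau, \dots, v_{|\tau|}^\tau$ be the vertices of $\tau$ and let $r = d_F(\sigma, \tau)$. There exists a
polygonal curve $\sigma' \in \mathbb{X}^d$ with every vertex contained in at least one of
$B(v_1^\tau, r), \dots, B(v_{|\tau|}^\tau, r)$, $d_F(\sigma', \tau) \leq d_F(\sigma, \tau)$ and $|\sigma'| \leq 2|\sigma| - 2$.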


Finally, we present our improved modification of Algorithm 3 from [7]. This
algorithm uses $\varepsilon$-coresets every time it has to evaluate the cost of a center set.
The effect of this small modification is that we nearly lose the original
linear running time dependency on $n$ in the most time-consuming part of the
algorithm, rendering it practical in the setting where the number of curves is
much larger than their complexity ($\ell < m < n$).
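Algorithm 3. $(1, \ell)$-Median by Simple Shortcutting and $\varepsilon$-Coreset
procedure $(1, \ell)$-Median-$(5 + \varepsilon)$-Approximation($T = \{\tau_1, \dots, \tau_n\}$, $\delta$, $\varepsilon$)
    $\hat{c} \leftarrow (1, \ell)$-Median-34-Approximation($T$, $\delta/4$) [8, Algorithm 1]
    $\varepsilon' \leftarrow \varepsilon/67$, $P \leftarrow \emptyset$
    $(T', w) \leftarrow (1, 2\ell - 2)$-Median-Coreset($T$, $\delta/4$, $\varepsilon'$)
    $\Delta \leftarrow \mathrm{cost}_w(T', \{\hat{c}\})$, $\Delta_u \leftarrow \Delta/(1 - \varepsilon')$, $\Delta_l \leftarrow \Delta/((1 + \varepsilon') 34)$
    $S \leftarrow$ sample $\lceil -2 (\varepsilon')^{-1} (\ln(\delta) - \ln(4)) \rceil$ curves from $T$ uniformly
        and independently with replacement
    $W \leftarrow$ sample $\lceil -64 (\varepsilon')^{-2} (\ln(\delta) - \ln(\lceil -8 (\varepsilon')^{-1} (\ln(\delta) - \ln(4)) \rceil)) \rceil$ curves
        from $T$ uniformly and independently with replacement
    $c \leftarrow$ arbitrary element from $\arg\min_{s \in S} \mathrm{cost}(W, s)$
    for $i = 1, \dots, |c|$ do
        $P \leftarrow P \cup G(B(v_i^c, (3 + 4\varepsilon') \Delta_u / n), \varepsilon' \Delta_l / (n \sqrt{d}))$    ($v_i^c$: $i$th vertex of $c$)
    $C \leftarrow$ set of all polygonal curves with $2\ell - 2$ vertices from $P$
    return $\arg\min_{c' \in C} \mathrm{cost}_w(T', \{c'\})$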


We show the correctness and analyze the running time of Algorithm 3. 


Theorem 9. Given two parameters $\delta \in (0,1)$, $\varepsilon \in (0, 1/2]$ and a set $T = \{\tau_1, \dots, \tau_n\} \subseteq \mathbb{X}^d_m$
of polygonal curves, with probability at least $1 - \delta$ Algorithm 3
returns a $(5 + \varepsilon)$-approximate $(1, \ell)$-median for $T$ with $2\ell - 2$ vertices, in time


O (nn log(m) + m? log(m) log? (1/6) + m7*~? log(m) teat) tos (/0)) we) . 


We prove Theorem 9 (in the full paper [10]) by modifying the proof of [7,
Theorem 5.1].


5 Conclusion 


We presented an algorithm for computing $\varepsilon$-coresets for $(k, \ell)$-median clustering
of polygonal curves under the Fréchet distance and used these to improve
the running time of an existing approximation algorithm for $(1, \ell)$-median clustering.
Unfortunately, it was not possible to improve the existing $(k, \ell)$-median
approximation algorithms in [8,12] by means of $\varepsilon$-coresets. This is due to the
recursive approximation scheme used in these works, where the candidate center
sets are not necessarily evaluated against the input, but against subsets of the
input. Thus, we would need an $\varepsilon$-coreset for any subset of the input, which is not
practical. We note that, to the best of our knowledge, no $(k, \ell)$-median clustering
algorithm exists that does not employ this approximation scheme.

It is still an interesting open problem whether there exist sublinear size $\varepsilon$-coresets
for weighted sets of polygonal curves. To derive such a result one may
need a sublinear bound on the VC dimension of the range space of metric balls
under scaled Fréchet distances, which is not evident at the moment. We note
that such a result would enable the use of the iterative size reduction technique
recently introduced by Braverman et al. [3].
Abstract. The paper is devoted to the study of geodetic convex hulls
in graphs from a theoretical and practical perspective. The notion of 
convexity can be transferred from continuous geometry to discrete graph 
structures by defining a node subset to be (geodetically) convex if all 
shortest paths between its members do not leave the subset. The geodetic 
convex hull of a node set W is the smallest convex superset of W. The 
hull number of a graph is then defined as the size of the smallest node 
subset $S$ (called hull set) whose convex hull contains all graph nodes. In
contrast to the geometric setting, where the point subset on the boundary 
of the convex hull can be computed in polynomial time, it is NP-hard 
to decide whether a graph has a hull set of size at most $s \in \mathbb{N}$. We
establish novel theoretical bounds for graph parameters related to convex 
graph structures, and also design practical algorithms for upper and 
lower bounding the hull number. We evaluate the quality of our bounds 
as well as the performance of the proposed algorithms on road networks 
and wireless sensor networks of varying size. 


Keywords: Hull number · Graph contour · Geodetic iteration number


1 Introduction 


There exist different notions of a hull of a graph. For embedded graphs, the
convex hull of the nodes [12] or the polygonal hull of the edges [14] are utilized
for applications such as clustering, shape recognition, or area-of-interest visualization.
In this paper, however, we focus on geodetic convex hulls, which do not
depend on any embedding but are based solely on the topology of the graph [13].
Here, given a connected, undirected and unweighted graph $G(V, E)$, we denote
with $I(a, b)$ the set of nodes on shortest paths between $a \in V$ and $b \in V$. This
set is also called the interval of $a$ and $b$. The interval of a node set $S \subseteq V$ is the
union of the intervals of all pairs of nodes $a, b \in S$, that is, $I[S] := \bigcup_{a, b \in S} I(a, b)$.
A set $S$ is called convex if $I[S] = S$. This implies that all shortest paths between
nodes in $S$ are fully contained in $S$. The (geodetic) convex hull $h(W)$ of a node
set $W \subseteq V$ is then defined as the smallest convex superset $S$ of $W$. This notion
is inspired by the geometric concept of convexity, where a point set is convex if
and only if it contains all straight line segments between its members.
Similar to the geometric setting, a geodetic convex hull may be represented by
its extreme points. For a convex point set in the plane, an extreme point is a point
that does not lie on any open line segment between two other points in the set.
For a convex node set, a node is extreme if it is not contained as an inner node in a
shortest path between other extreme nodes. More formally, for a convex node set $S$
in a graph, we call a node set $W$ a hull set of $S$ if $I^k[W] = S$ for some $k \in \mathbb{N}$, where
$I^k[W] = I[W]$ for $k = 1$ and $I^k[W] = I[I^{k-1}[W]]$ for $k > 1$. The other way around, for any
set $W$, its convex hull $h(W)$ can be computed by applying this iterated interval
operation until $I^{k+1}[W] = I^k[W]$ and hence $I^k[W] = h(W)$ holds. The respective
value of $k$ is called the geodetic iteration number of $W$, denoted by $\mathrm{gin}(W)$.
Figure 1 illustrates the concept of iterated interval computation.


Fig. 1. Cutout of a road network with the geodetic convex hull of a set W of three 
nodes (red). Each color encodes an iteration step in which nodes on shortest paths 
between nodes in the current set W are added to W. The final set (obtained after 
gin(W) = 7 iterations) contains 7886 nodes. (Color figure online) 
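
The iterated interval computation can be sketched in a few lines. The code below is a straightforward, unoptimized illustration, assuming a connected graph given as an adjacency dictionary; the function names are hypothetical. It counts the number of interval applications that still add nodes, so an already convex set yields 0.

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s to all nodes of a connected graph."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def interval(adj, nodes):
    """I[S]: all nodes on a shortest path between two nodes of S."""
    nodes = set(nodes)
    dist = {s: bfs_dist(adj, s) for s in nodes}
    result = set(nodes)
    for v in adj:
        if v not in result and any(
            dist[a][v] + dist[b][v] == dist[a][b] for a in nodes for b in nodes
        ):
            result.add(v)
    return result

def convex_hull(adj, nodes):
    """Return (h(W), number of interval applications that added nodes)."""
    current, k = set(nodes), 0
    while True:
        nxt = interval(adj, current)
        if nxt == current:
            return current, k
        current, k = nxt, k + 1
```

Each round runs one BFS per node of the current set and tests all pairs, so this sketch is only suitable for small graphs.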


The size of a smallest set $S$ such that $I^k[S] = V$ for some $k \in \mathbb{N}$ is called the
hull number of the graph, abbreviated as $\mathrm{hn}(G)$. For geometric convex hulls as
well as polygonal hulls, the set of extreme points or nodes on the boundary can 
be computed in polynomial time. 

Unfortunately, deciding whether $\mathrm{hn}(G) \leq s$ for $s \in \mathbb{N}$ poses an NP-hard decision
problem, even in several restricted graph classes such as bipartite graphs [1].
Nevertheless, geodetic convex hulls and convex node sets have important applications
in data structure design [11], connectivity analysis [15], route planning
[20], and graph similarity assessment [6]. Moreover, the hull number and related
parameters are relevant for theoretical investigations such as the characterization
of classes of chordal graphs [8] or parameterizability of certain optimization
problems [16].

The goal of this paper is to establish new (theoretical) bounds for the geode- 
tic iteration number as well as the hull number, and to enable the efficient 
computation of concise hull sets in practice. 


1.1 Related Work 


The computation of boundary nodes of a graph is relevant for a wide range 
of applications. One particular well-studied application field is the analysis of 
wireless sensor networks. Here nodes represent sensors which are able to detect 
signals and communicate with other sensors in their proximity. These connections
are then modelled as undirected edges. The goal is typically to compute the
boundary of the network, which includes the detection of so-called holes (areas
that are not well covered by sensors). A recent survey [5] discusses a rich set of
approaches for boundary detection. Most of these papers are focused on algo- 
rithms that work well in practice, though, and rarely provide clear definitions 
of boundary nodes or holes (or only ones that depend on selected thresholds). 
While there are some approaches which come with quality guarantees [10,19], 
those usually strongly depend on certain model assumptions. 

Convexity structures and hull sets are well-defined in any graph and are 
fundamental concepts in graph theory [7]. The hull number (the size of the 
smallest hull set of a graph) turned out to be NP-hard to compute even in 
bipartite graphs [1] and chordal graphs [2], though. 

The boundary of the graph $\partial(G)$ and the contour of the graph $\mathrm{Ct}(G)$ (see
Sect. 2 for definitions) were both shown to constitute a (not necessarily minimum
sized) hull set [4,17]. But they come with different geodetic iteration numbers.
While $\mathrm{gin}(\partial(G)) = 1$ in all graphs, $\mathrm{gin}(\mathrm{Ct}(G)) = 1$ only holds in certain
graph classes (such as chordal graphs), and examples with up to $\mathrm{gin}(\mathrm{Ct}(G)) = 3$
are known in general graphs [17]. It is an open question whether graphs with
$\mathrm{gin}(\mathrm{Ct}(G)) > 3$ exist. Further theoretical results and related notions of convexity
are discussed in a survey by Brešar et al. [3] and a book by Pelayo [18].


1.2 Contribution 


The following theoretical and practical results for hull computation in graphs 
are presented in this paper: 


– We first study the geodetic iteration number of a special node set, called the
graph contour $\mathrm{Ct}(G)$. It is known that the contour is always a hull set of the
graph, but it is also relevant to identify the value of $k$ such that $I^k[\mathrm{Ct}(G)] = V$,
that is, to compute $\mathrm{gin}(\mathrm{Ct}(G))$. We prove several new upper bounds for
$k$ that relate the geodetic iteration number to the graph diameter as well as
the contour size.


184 S. Storandt 


— For the hull number hn(G), we provide novel lower and upper bounding tech- 
niques based on structural insights. The upper bound is based on a heuristic 
that computes a feasible hull set (with bounded gin) in linear time. 

— In the experimental evaluation, we demonstrate the quality of our bounds 
as well as the applicability of our novel algorithms on road networks and 
wireless sensor networks. As one important result, we show that our heuristic 
for computing a hull set is fast in practice and produces solutions that are 
significantly smaller than the graph contour. 


2 Preliminaries 


Throughout the paper, we assume to be given a connected, undirected and
unweighted graph $G(V, E)$ with $|V| = n$ nodes and $|E| = m$ edges. With
$d(v, w)$ we denote the minimum hop distance between nodes $v \in V$ and $w \in V$.
For each node $v \in V$, the eccentricity $\mathrm{ecc}(v)$ is defined as the length of the
longest shortest path emerging from $v$. More formally, $\mathrm{ecc} \colon V \to \mathbb{N}$ with
$\mathrm{ecc}(v) = \max_{w \in V} d(v, w)$. Based on this notion, we define the following graph
parameters and special node sets (see Fig. 2 for illustrations):


Fig. 2. Small graph with eccentricity values depicted in purple. The diameter is 5 and 
the radius is 3. The contour is formed by the node set $\{a, b, c\}$, the boundary by all
nodes except $v$. Note that this is an example of $\mathrm{gin}(\mathrm{Ct}(G)) = 2$ as $w \notin I[\mathrm{Ct}(G)]$.
The path wu, v,a is a trail in G. (Color figure online) 


– The diameter $\mathrm{diam}(G)$ of a graph is the largest shortest path distance in $G$,
that is, $\mathrm{diam}(G) := \max_{v \in V} \mathrm{ecc}(v)$.

– The radius $\mathrm{rad}(G)$ of a graph is the smallest distance value such that some
node $v$ can reach all other nodes in $G$ with a shortest path of at most that
length, hence $\mathrm{rad}(G) := \min_{v \in V} \mathrm{ecc}(v)$.

– The contour $\mathrm{Ct}(G)$ is the set of nodes with maximum eccentricity among
their neighbors: $\mathrm{Ct}(G) := \{v \in V \mid \forall w \in N(v) \colon \mathrm{ecc}(v) \geq \mathrm{ecc}(w)\}$ where
$N(v) := \{w \in V \mid \{v, w\} \in E\}$.

– A trail is a path $p = v_1, v_2, \dots, v_l$ in $G$ with $\mathrm{ecc}(v_{i+1}) = \mathrm{ecc}(v_i) + 1$ for
$i = 1, \dots, l - 1$. The nodes $v_2, \dots, v_l$ are then said to be on a trail from $v_1$
and belong to the trail set $\mathrm{tr}(v_1)$.
For disambiguation, we further include the definition of the boundary $\partial(G)$ of a
graph: $\partial(G) := \{v \in V \mid \exists u \in V \colon \forall w \in N(v) \colon d(u, w) \leq d(u, v)\}$. The boundary
is always a hull set of $G$ with $\mathrm{gin}(\partial(G)) = 1$ and a superset of the contour [4].
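
The parameters defined above can be computed with one BFS per node. The following sketch assumes a connected graph stored as an adjacency dictionary; the function names `eccentricities` and `contour` are hypothetical. It takes $O(n(n + m))$ time and is therefore only meant for small inputs.

```python
from collections import deque

def eccentricities(adj):
    """ecc(v) for every node v, via one BFS per node."""
    ecc = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        ecc[s] = max(dist.values())
    return ecc

def contour(adj, ecc):
    """Nodes whose eccentricity is not exceeded by any neighbor."""
    return {v for v in adj if all(ecc[v] >= ecc[w] for w in adj[v])}
```

Diameter and radius then follow as `max(ecc.values())` and `min(ecc.values())`. On a path with five nodes, for example, the contour consists of the two end nodes.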


3 Bounds for the Gin of the Graph Contour 


The contour $\mathrm{Ct}(G)$ of a graph $G$ always yields a hull set. The geodetic iteration
number of the contour $\mathrm{gin}(\mathrm{Ct}(G))$ denotes the smallest value $k$ such that
$I^k[\mathrm{Ct}(G)] = V$. It was conjectured for over a decade that $\mathrm{gin}(\mathrm{Ct}(G)) \leq 2$ holds
in all graphs. But a counter-example with $\mathrm{gin}(\mathrm{Ct}(G)) = 3$ was provided in [17].
It still is an open question whether $\mathrm{gin}(\mathrm{Ct}(G))$ is always a small constant. A
trivial upper bound is $\mathrm{gin}(\mathrm{Ct}(G)) \leq n$ as in each iteration at least one node
has to be added to the set. In the following, we prove several non-trivial upper
bounds for $\mathrm{gin}(\mathrm{Ct}(G))$ by establishing connections to the graph diameter and
the contour size.


Theorem 1. For any graph G, the geodetic iteration number of the contour 
Ct(G) is upper bounded by diam(G) — rad(G). 


Proof. We prove the following statement by induction over $k$:

$$I^k[\mathrm{Ct}(G)] \supseteq \{v \in V \mid \mathrm{ecc}(v) \geq \mathrm{diam}(G) - k\}$$

Hence after at most $k = \mathrm{diam}(G) - \mathrm{rad}(G)$ iterations, all nodes $v \in V$ are contained
in $I^k[\mathrm{Ct}(G)]$. For $k = 0$, it is required that all nodes with an eccentricity
equal to the diameter of the graph are contained in the contour. This is true by
definition of the contour (as $\mathrm{ecc}(v) \leq \mathrm{diam}(G)$ for all $v \in V$). Next, we consider
$I^{k+1}[\mathrm{Ct}(G)]$ and a node $v \notin I^k[\mathrm{Ct}(G)]$ with $\mathrm{ecc}(v) = \mathrm{diam}(G) - (k + 1)$. As
from $v \notin I^k[\mathrm{Ct}(G)]$ it follows $v \notin \mathrm{Ct}(G)$, there has to exist a neighboring node
$w \in N(v)$ with $\mathrm{ecc}(w) = \mathrm{ecc}(v) + 1$. Let $z$ be a node with $d(w, z) = \mathrm{ecc}(w)$.
Then both $w$ and $z$ have a higher eccentricity than $v$ and are hence contained
in $I^k[\mathrm{Ct}(G)]$ according to the induction hypothesis. As $\mathrm{ecc}(v) = \mathrm{ecc}(w) - 1$, it follows
that $d(v, z) \leq \mathrm{ecc}(v) = \mathrm{ecc}(w) - 1$ and therefore $d(w, v) + d(v, z) \leq d(w, z)$.
By the triangle inequality, this has to hold with equality. It follows that $v$ is on a shortest path
from $w$ to $z$; and with $w, z \in I^k[\mathrm{Ct}(G)]$, we hence conclude that $v \in I^{k+1}[\mathrm{Ct}(G)]$.


Note that this bound is tight for the example graph in Fig. 2, as there we have 
diam(G) — rad(G) = 2 = gin(Ct(G)). 


Corollary 1. Theorem 1 in combination with the simple fact that $\mathrm{rad}(G) \geq \mathrm{diam}(G)/2$
yields $\mathrm{gin}(\mathrm{Ct}(G)) \leq \mathrm{diam}(G)/2$.


We next want to prove connections between gin(Ct(G)) and the eccentricity of 
the contour nodes as well as the size of the contour. Our proofs are based on a 
lemma from Mezzini [17] which is rephrased below. 
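Lemma 1. If $I^2[\mathrm{Ct}(G)] \neq V$, then there exist nodes $a_1, \dots, a_6$ such that

– $a_1 \in V \setminus I^2[\mathrm{Ct}(G)]$
– $a_2 \in \mathrm{tr}(a_1) \cap \mathrm{Ct}(G)$
– $a_3 \notin \mathrm{Ct}(G)$, $d(a_2, a_3) = \mathrm{ecc}(a_2)$
– $a_4 \in \mathrm{tr}(a_3) \cap \mathrm{Ct}(G)$
– $a_5 \notin \mathrm{Ct}(G)$, $d(a_4, a_5) = \mathrm{ecc}(a_4)$
– $a_6 \in \mathrm{tr}(a_5) \cap \mathrm{Ct}(G)$
– $\mathrm{ecc}(a_6) \geq \mathrm{ecc}(a_1) + 3$
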


Fig. 3. Schematic depiction of the relationship between the nodes described in 
Lemma 1. The turquoise cycle illustrates the graph contour with nodes $a_2$, $a_4$ and
$a_6$ being part of the contour. Straight lines indicate shortest paths between nodes, and
path sections with arrows correspond to trails on which the eccentricity of the nodes 
increases by one along each directed edge. 


An illustration of the configuration described in Lemma 1 is provided in Fig. 3. 
We will next prove a generalization of this lemma. 


Theorem 2. For any node $a_1 \notin I^k[\mathrm{Ct}(G)]$ for $k \geq 2$, nodes $a_2, \dots, a_6$ with
properties as described in Lemma 1 exist.


Proof. Let $a_1 \notin I^k[\mathrm{Ct}(G)]$ for a $k \geq 2$. Accordingly, $a_1 \notin \mathrm{Ct}(G)$ and hence $a_1$ has
a neighboring node with eccentricity $\mathrm{ecc}(a_1) + 1$. As this applies to all nodes that
are not part of the contour, there exists a path from any node $v \in V \setminus \mathrm{Ct}(G)$ to
a contour node such that the eccentricity of the nodes along the path increases
by one in each step. Therefore, $a_2 \in \mathrm{tr}(a_1) \cap \mathrm{Ct}(G)$ has to exist. Of course,
there needs to be a node $a_3$ such that the eccentricity of $a_2$ is realized, that
is, $d(a_2, a_3) = \mathrm{ecc}(a_2)$. Now, we further conclude that there exists a shortest
path from $a_2$ to $a_3$ that traverses $a_1$, as we know that $d(a_1, a_3) \leq \mathrm{ecc}(a_1)$ and
$\mathrm{ecc}(a_1) = \mathrm{ecc}(a_2) - d(a_2, a_1)$ based on $a_2 \in \mathrm{tr}(a_1)$ and therefore $d(a_2, a_1) + d(a_1, a_3) \leq \mathrm{ecc}(a_2)$.
It follows that $a_3$ cannot be part of the contour, as otherwise
$a_1$ would be included in $I[\mathrm{Ct}(G)]$. The node $a_4$ exists for the same reasons as
$a_2$, now based on $a_3 \notin \mathrm{Ct}(G)$. For node $a_5$, we repeat the argument used to
show the existence of $a_3$, but now $a_5 \in \mathrm{Ct}(G)$ would lead to $a_1 \in I^2[\mathrm{Ct}(G)]$,
which still is a contradiction to the choice of $a_1$. The node $a_6$ then exists for
the same reasons as $a_2$ and $a_4$ based on the observation that $a_5 \notin \mathrm{Ct}(G)$. It
remains to show that the nodes are all distinct and that $\mathrm{ecc}(a_6) \geq \mathrm{ecc}(a_1) + 3$.
As $a_1 \notin \mathrm{Ct}(G)$ and $a_2 \in \mathrm{tr}(a_1) \cap \mathrm{Ct}(G)$, we clearly have $\mathrm{ecc}(a_2) > \mathrm{ecc}(a_1) > 0$
and with that also $\mathrm{ecc}(a_2) \geq 2$. The latter excludes $a_3 = a_2$ and $a_3 = a_1$. The
node $a_4$ then needs to have a higher eccentricity than all nodes with smaller index
and is hence distinct from all of them. Analogous arguments apply to $a_5$ and $a_6$
and we conclude $\mathrm{ecc}(a_6) > \mathrm{ecc}(a_5) \geq \mathrm{ecc}(a_4) > \mathrm{ecc}(a_3) \geq \mathrm{ecc}(a_2) > \mathrm{ecc}(a_1) > 0$.
These inequalities imply that $\mathrm{ecc}(a_6) \geq \mathrm{ecc}(a_1) + 3$.


Observation 1. It follows directly from the proof of Theorem 2 that $a_5 \in I^k[\mathrm{Ct}(G)]$
implies $a_1 \in I^{k+2}[\mathrm{Ct}(G)]$, because $a_2, a_4 \in \mathrm{Ct}(G)$, $a_3$ is on a shortest
path between $a_4$ and $a_5$, and $a_1$ is on a shortest path between $a_2$ and $a_3$.


Let now $\epsilon(G) := \min_{v \in \mathrm{Ct}(G)} \mathrm{ecc}(v)$ be the minimum eccentricity of a contour
node in $G$. Then the following relationship holds.


Theorem 3. For any graph $G$, the geodetic iteration number of the contour
$\mathrm{Ct}(G)$ is upper bounded by $\mathrm{diam}(G) - \epsilon(G) + 1$.


Proof. In the proof of Theorem 2, we observed that $\mathrm{ecc}(a_5) > \mathrm{ecc}(a_2)$. With $a_2 \in \mathrm{Ct}(G)$,
we have $\mathrm{ecc}(a_2) \geq \epsilon(G)$. Based on Theorem 1, we can now conclude that
$a_5 \in I^k[\mathrm{Ct}(G)]$ with $\mathrm{diam}(G) - k = \mathrm{ecc}(a_2) + 1$. Combined with Observation 1, we
have $a_1 \in I^{k+2}[\mathrm{Ct}(G)]$ with $k + 2 = \mathrm{diam}(G) - \mathrm{ecc}(a_2) + 1 \leq \mathrm{diam}(G) - \epsilon(G) + 1$.


Again, the graph in Fig. 2 is a tight example, as there we have $\mathrm{diam}(G) - \epsilon(G) + 1 = 5 - 4 + 1 = 2 = \mathrm{gin}(\mathrm{Ct}(G))$.
But Theorem 3 provides a stronger
upper bound than Theorem 1 whenever the minimum eccentricity of the contour
nodes is larger than the minimum eccentricity of all nodes plus one.


Theorem 4. For any graph $G$, the geodetic iteration number of the contour
$\mathrm{Ct}(G)$ is upper bounded by $|\mathrm{Ct}(G)|$.


Proof. If there are only two contour nodes, the claim is trivially true. Now let
$c_1, \dots, c_s$ with $s = |\mathrm{Ct}(G)| \geq 3$ be the contour nodes, sorted increasingly by
eccentricity. Hence we have $\mathrm{ecc}(c_{s-1}) = \mathrm{ecc}(c_s) = \mathrm{diam}(G)$. We define the node
sets $A_i$ for $i = 1, \dots, s - 1$ as $A_i := \{v \in V \mid \mathrm{ecc}(c_i) \leq \mathrm{ecc}(v) < \mathrm{ecc}(c_{i+1})\}$ as
well as $A_0 := \{v \in V \mid \mathrm{ecc}(v) < \mathrm{ecc}(c_1)\}$ and $A_s := \{v \in V \mid \mathrm{ecc}(v) \geq \mathrm{ecc}(c_s)\}$.
We observe that all nodes in $A_i$ for $i \geq s - 3$ are contained in $I^2[\mathrm{Ct}(G)]$ as for
$a_1 \in A_i$ there do not exist three contour nodes with higher and pairwise different
eccentricity that can take the roles of $a_2$, $a_4$ and $a_6$ described in Lemma 1. Now
consider $a_1 \in A_i$ for any $i < s - 3$. Then the smallest possible indices of the
contour nodes $a_2, a_4, a_6$ in the sorted order are $i + 1, i + 2, i + 3$. Based on
$\mathrm{ecc}(a_5) \geq \mathrm{ecc}(a_4)$, it follows that $a_5$ has to be contained in set $A_j$ for some $j \geq i + 2$.
Together with Observation 1, which tells us that $a_1$ is added to the iterated
contour set at most two rounds after $a_5$, we conclude that the nodes in any set
$A_{j \geq 0}$ are added after at most $s$ iterations and hence $\mathrm{gin}(\mathrm{Ct}(G)) \leq |\mathrm{Ct}(G)|$.


4 Bounding the Hull Number 


In this section, we investigate upper and lower bounds for the hull number of a 
graph with a focus on bounds that can be efficiently computed in practice. 
Upper Bounds. Simple theoretical upper bounds for the hull number such as $\mathrm{hn}(G) \leq n - \mathrm{diam}(G) + 1$
were discussed in [7]. The size of the graph contour also constitutes
a valid upper bound (with a priori unclear $\mathrm{gin}(\mathrm{Ct}(G))$ value), and so
does the size of the boundary (with $\mathrm{gin}(\partial(G)) = 1$). But the computation of the
contour as well as the boundary requires the knowledge of all pairwise distances
between graph nodes and therefore takes time $O(n^2 + nm)$, which is not practical
for large input networks. However, we can also compute an upper bound with
a single BFS run in $O(n + m)$ as detailed in the following observation.


Observation 2. For a given graph G(V, E) and any node v ∈ V the set S =
{v} ∪ L(v), where L(v) denotes the set of leaves in a BFS-tree rooted at v, is a hull
set with gin(S) = 1.



Fig. 4. Example of BFS based hull set computation. Left image: The leaf nodes in the
BFS tree from v (thick green edges) form together with v a hull set S (red nodes) with
gin(S) = 1. Middle image: In the BFS tree from s there are two paths which contain
three current hull nodes. Hence the middle nodes (purple) can be pruned. Right image:
Valid reduced hull set (red nodes) S with gin(S) ≤ 2. (Color figure online)
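
The construction in Observation 2 can be illustrated with a short Python sketch. It is not the C++ implementation used in the experiments. It assumes a connected, undirected graph given as an adjacency dictionary, and the function names (bfs, bfs_hull_set, geodetic_interval) are chosen here purely for illustration. The brute-force interval check is only meant for verifying small examples.

```python
from collections import deque
from itertools import product


def bfs(adj, source):
    """Breadth-first search; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent


def bfs_hull_set(adj, v):
    """Return S = {v} ∪ L(v), with L(v) the leaves of one BFS tree rooted at v."""
    _, parent = bfs(adj, v)
    has_child = {p for p in parent.values() if p is not None}
    leaves = set(parent) - has_child
    return leaves | {v}


def geodetic_interval(adj, nodes):
    """I[S]: nodes lying on some shortest path between two nodes of S.

    Brute force (one BFS per node of S); assumes a connected graph.
    """
    dist = {a: bfs(adj, a)[0] for a in nodes}
    return {x for x in adj
            for a, b in product(nodes, repeat=2)
            if dist[a][x] + dist[b][x] == dist[a][b]}


# Small example: a path with four nodes, BFS rooted at node 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
S = bfs_hull_set(path, 1)
```

Every node lies on the tree path from v to some leaf, and tree paths from the root are shortest paths, so a single application of the interval operator to S already covers all nodes.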


This simple observation also reveals a connection between hn(G) and the max- 
imum leaf number ml(G), the largest number of leaves in any spanning tree of 
G, which is used in FPT algorithm design of e.g. coloring problems [9]. 


Corollary 2. hn(G) ≤ ml(G) + 1.


The bound is tight for complete graphs K_n with n ≥ 2 where hn(G) = n and
ml(G) = n − 1.

The BFS bound from Observation 2 may be further improved in practice
by iterating the following process: Select any node s ∈ S and compute the
respective BFS tree. If in an interval I(s, s’) with s’ ∈ S there are other nodes
from S, those can be pruned from S. Figure 4 shows a successful example of
this pruning strategy (with paths instead of intervals for clarity). We note that
with every iteration of BFS based pruning the geodetic iteration number might
increase by at most 1. Hence, using B BFS runs in total, we get a running time
of O(B(n + m)) and end up with a valid hull set S with gin(S) ≤ B.
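
One pruning round can be sketched as follows. This is an illustrative Python version of the path-based variant shown in Fig. 4 (tree paths instead of full intervals), not the implementation evaluated in the paper. The names bfs_parents and prune_once are hypothetical, and the graph is assumed to be connected and given as an adjacency dictionary.

```python
from collections import deque


def bfs_parents(adj, source):
    """Parent pointers of one BFS tree rooted at source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent


def prune_once(adj, hull_set, s):
    """One pruning round rooted at s, where s is a node of hull_set.

    A hull node that lies strictly inside the BFS-tree path from s to
    another hull node is on a shortest path between two remaining hull
    nodes, so it can be dropped. Each tree node is walked at most once,
    which keeps the round linear in the size of the graph.
    """
    parent = bfs_parents(adj, s)
    visited, redundant = set(), set()
    for t in hull_set:
        u = parent[t]
        while u is not None and u != s and u not in visited:
            visited.add(u)
            if u in hull_set:
                redundant.add(u)
            u = parent[u]
    return set(hull_set) - redundant


# Example: on a path 0-1-2-3-4 the middle hull node 2 is redundant.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
reduced = prune_once(path, {0, 2, 4}, 0)
```

Calling prune_once repeatedly with different roots corresponds to the B BFS runs discussed above.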


Lower Bounds. Clearly, for any graph with more than one node, hn(G) ≥ 2 has
to hold, as there need to be at least two nodes in a hull set S to ensure |I[S]| > 1.
In [1], it was proven that the hull number of a graph can be computed based on
considering its two-connected components split at their cut nodes. To improve
on that, we will now consider general node subsets W ⊆ V together with their
set c(W) of cut nodes (the subset of W with a neighbor outside of W) and show
that under certain conditions, W has to contain a node from the hull set.


Lemma 2. If W \ h(c(W)) ≠ ∅ for a node subset W of graph G, then every
hull set of G needs to contain at least one node from W.


Proof. Assume for contradiction that S is a hull set, S ∩ W = ∅ and W \
h(c(W)) ≠ ∅. Based on S being a hull set, we know that h(S) ⊇ W. Obviously,
shortest paths between nodes a, b ∈ V \ W that intersect W have to enter and
exit W via nodes in c(W), and their intersections with W have to be shortest
paths as well. As S ∩ W = ∅, it follows that h(S) ∩ W ⊆ h(c(W)). But as we
know that W \ h(c(W)) ≠ ∅, this poses a contradiction to h(S) ⊇ W. Accordingly,
there has to exist a hull set node in S ∩ W.


Fig. 5. Node subset W (indicated by the turquoise box) and its cut nodes c(W) (large 
black dots at the border of the box). The shortest paths between the cut nodes all 
have length 2 and are drawn in green. The nodes on these paths belong to h(c(W)). 
However, the red nodes are not on any shortest path between nodes in h(c(W)). Hence, 
a valid hull set has to contain at least one of them. (Color figure online) 


Figure 5 provides a small example instance for which the condition specified in
Lemma 2 is met. Based on this lemma, a lower bound for hn(G) can be obtained
by first selecting a set of pairwise intersection free induced subgraphs W_1, ..., W_l
and then counting the W_i for which the lemma applies.
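
The counting argument can be sketched in Python as follows. This is a brute-force illustration for small graphs, not the implementation used in the experiments. It assumes a connected graph given as an adjacency dictionary and takes the hull h(c(W)) with respect to distances in the whole graph G. The function names are hypothetical.

```python
from collections import deque


def bfs_dist(adj, source):
    """Hop distances from source to all reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def convex_hull(adj, nodes):
    """h(S): apply the geodetic interval until a fixed point is reached."""
    dist = {u: bfs_dist(adj, u) for u in adj}
    current = set(nodes)
    while True:
        grown = {x for x in adj for a in current for b in current
                 if dist[a][x] + dist[x][b] == dist[a][b]}
        if grown == current:
            return current
        current = grown


def cut_nodes(adj, W):
    """c(W): nodes of W with at least one neighbor outside of W."""
    return {w for w in W if any(u not in W for u in adj[w])}


def lemma2_applies(adj, W):
    """True if W \\ h(c(W)) is non-empty, i.e. W must contain a hull set node."""
    return bool(set(W) - convex_hull(adj, cut_nodes(adj, W)))


def lower_bound(adj, subsets):
    """Count the pairwise disjoint subsets W_i for which Lemma 2 applies."""
    return sum(lemma2_applies(adj, W) for W in subsets)


# Example: on the path 0-1-2-3-4 both end pieces certify a hull node.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
bound = lower_bound(path, [{0, 1}, {3, 4}])
```

The caller is responsible for passing pairwise disjoint subsets; the sketch does not check this.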


5 Experimental Results 


Algorithms were implemented in C++. Experiments were conducted on a single
core of an Intel(R) i5-8250U CPU clocked at 1.60 GHz with 32 GB of RAM.
We use two different graph types in the experiments: real-world road networks 
and simulated sensor networks. 


Road Networks. The road networks are connected subgraphs of the OSM Germany
graph¹, which we consider as undirected and unweighted. We note that
these graphs contain many nodes of degree 1 (dead-ends). This is very beneficial
for our hull set computation algorithms, as all of them need to be contained
in any feasible hull set. Therefore, we also consider the corresponding network
instances in which we recursively delete all nodes of degree 1 until the minimum
degree in the graph is 2. Those should pose more difficult instances. Table 1
provides an overview of the characteristics of the used graphs.


¹ https://i11www.iti.kit.edu/resources/roadgraphs.php


Table 1. Road network benchmark data. The last column (δ1) denotes the percentage
of degree-1 nodes.


Name     | n         | m         | δ1
ROAD1_δ1 | 99,127    | 105,517   | 7%
ROAD1_δ2 | 73,185    | 79,575    | 0%
ROAD2_δ1 | 290,659   | 300,967   | 8%
ROAD2_δ2 | 180,875   | 191,183   | 0%
ROAD3_δ1 | 990,732   | 1,057,821 | 6%
ROAD3_δ2 | 783,079   | 850,168   | 0%
ROAD4_δ1 | 3,638,604 | 3,794,477 | 8%
ROAD4_δ2 | 2,530,393 | 2,686,266 | 0%


Sensor Networks. The sensor networks were obtained by choosing n random
node positions in the unit square and then connecting node pairs with an edge
if their Euclidean distance is at most r = c/√n for some constant c. We used
c = 2 and c = 3 to end up with sensor networks similar to those used in other
simulations. Figure 6 shows two examples of the resulting networks. We use the
following nomenclature for the generated graphs: SN[n]c[c]. Hence the two graphs
in Fig. 6 would be referred to as SN500c2 and SN500c3, respectively. Presented
results for sensor networks are always averaged over 10 random networks of the
specified size with the same c value.
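
The generation procedure described above can be sketched in Python as follows. This is a minimal illustration, not the generator used for the experiments. The function name sensor_network and the seed parameter are assumptions made here, and the quadratic pair loop is only suitable for small n. The sketch does not enforce connectivity of the resulting graph.

```python
import math
import random


def sensor_network(n, c, seed=None):
    """Random geometric graph in the unit square with radius r = c / sqrt(n).

    Returns (points, adj), where adj maps each node index to the list of
    its neighbors.
    """
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    r = c / math.sqrt(n)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) <= r:  # connect pairs within distance r
                adj[i].append(j)
                adj[j].append(i)
    return points, adj


# A graph in the style of SN500c2 (n = 500, c = 2).
points, adj = sensor_network(500, 2.0, seed=1)
```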


Fig. 6. Sensor networks with 500 nodes together with heuristic hull sets (red). In the 
sparser graph on the left (c = 2, average degree 6), a hull of size 50 is depicted. In the 
denser graph on the right (c = 3, average degree 13), a hull set of size 37 is shown. 
(Color figure online) 
5.1 Graph Hull Sets and Gin


We discussed two methods to compute a valid hull set for a given graph G: 
computing the graph contour Ct(G) and a BFS-based heuristic. 


Contour Computation. We first evaluate the size of the contour and other rel- 
evant parameters on our benchmark instances. As their computation times are 
quadratic in n, we only consider the two smaller road network instances here, as 
well as sensor networks with up to 16,000 nodes (which already is a larger num- 
ber of nodes than considered in most sensor network simulations). The results 
are summarized in Table 2. Interestingly, for all our tested instances the value 
gin(Ct(G)) turned out to be equal to 2. There are significant differences in the 
quality of our upper bounds, though. For road networks, the diameter is rather 
large and also significantly larger than the radius or €(G). Hence our bounds 
turn out to be loose on such networks. But €(G) is also larger than the radius 
by more than 1, showing that Theorem 3 indeed might provide stronger bounds 
than Theorem 1. For the sensor networks, our bounds are better due to the sig- 
nificantly smaller diameters. But we also see that €(G) is always very close to 
the radius, hence Theorems 3 and 1 produce similar bounds. 


Table 2. Experimental results for parameter and contour computation for selected
road and sensor networks.


Name      | diam(G) | rad(G) | €(G) | |Ct(G)| | Time
ROAD1_δ1  | 1406    | 727    | 735  | 12614   | 30min
ROAD1_δ2  | 1322    | 664    | 679  | 7567    | 16min
ROAD2_δ1  | 2956    | 1487   | 1508 | 33991   | 5h
ROAD2_δ2  | 2898    | 1449   | 1486 | 11682   | 2h
SN1000c2  | 28      | 15     | 17   | 183     | 0.1s
SN1000c3  | 17      | 9      | 10   | 133     | 0.2s
SN4000c2  | 55      | 28     | 29   | 630     | 2.9s
SN4000c3  | 34      | 18     | 19   | 369     | 4.3s
SN16000c2 | 112     | 56     | 57   | 2294    | 99.9s
SN16000c3 | 67      | 34     | 35   | 1382    | 136.1s


BFS-Based Heuristic. Next, we turn to our BFS-based heuristic, where the 
running time is linear as long as the number of iterations B is kept constant. 
We use B = 10 in the evaluation. The respective results for road networks are 
summarized in Table 3. For sensor networks of varying size, the lower bound and 
hull sizes are depicted in Fig. 7, once for c= 2 and once for c = 3. 
Table 3. Results for heuristic hull set computation on the road network instances
with B = 10. Timings are in seconds.


Name     | LB     | HS     | apx | Time
ROAD1_δ1 | 7103   | 10381  | 1.5 | 0.72
ROAD1_δ2 | 998    | 4239   | 4.2 | 0.40
ROAD2_δ1 | 23218  | 30363  | 1.3 | 4.98
ROAD2_δ2 | 1831   | 8306   | 4.5 | 1.86
ROAD3_δ1 | 59514  | 98114  | 1.6 | 13.69
ROAD3_δ2 | 8986   | 45910  | 4.9 | 8.54
ROAD4_δ1 | 288324 | 409867 | 1.4 | 44.64
ROAD4_δ2 | 23477  | 121278 | 5.2 | 18.72


For the road network instances, we observe that (as expected) the lower and
upper bound are much closer for the instances which include dead-ends. But
in these graphs also the hull set sizes are significantly larger. In the pruned
graphs, the quality guarantee is about a factor of 5. The computation time for
the lower bound was always comparable to that of the heuristic, and on average
over 60% of the investigated subgraphs (chosen as Voronoi cells induced by the
heuristic solution) that contributed to the lower bound value had a cut size of 3
or more. That means that without Lemma 2, the lower bounds would have been
significantly weaker. For sensor networks, we see that the hull set sizes for c = 2
are larger than for c = 3 but at the same time the lower bounds are much
better. This makes sense as with c = 3 induced subgraphs typically have many
cut nodes, and given the high ambiguity of shortest paths in these networks, the
certificate from Lemma 2 is rarely issued. For c = 2, the quality guarantee ranges
from a factor of 3 to 12, showing that our computed hull sets are sensible. Note
that here 100% of the induced subgraphs that contributed to the lower bound
had a cut size of 3 or more. Hence without Lemma 2, we could not have gotten
a value larger than the trivial lower bound of 2.


Fig. 7. Results for heuristic hull set computation for sensor networks with B = 10 as
a function of the number of nodes. Note the log scale on the y-axis.

Remarkably, across all instances - road networks and sensor networks alike - 
the size of the BFS-based hull set is always significantly smaller than the contour 
size. This is true even when we use B = 2 and thus have matching gin values. 
There, we get reductions around 10%-50%. Accordingly, our heuristic is not only 
significantly faster than the contour computation, but also yields better results 
and further allows us to trade running time (and gin value) for solution size. 


6 Conclusions and Future Work 


We demonstrated in this paper that hull sets of good quality (with bounded gin) 
can be computed efficiently even in large networks. It still would be interesting to 
design or rule out approximation algorithms for the hull number. Furthermore, 
our new upper bounds for the gin of the graph contour might help to search 
for instances with gin(Ct(G)) > 3, which are currently unknown (to exist). The 
established bounds imply, for example, that a graph with gin(Ct(G)) = 4 has 
to have a contour of size at least 4 and a diameter of at least 8. 
Abstract. In this paper, we study the single-round geodesic Voronoi
games on various classes of polygons and polyhedra for two players. We
prove some tight bounds on the payoffs, that is, the number of clients
served by both the first player, Alice, and the second player, Bob, for
orthogonal convex polygons and polyhedra for the L_1 metric.


Keywords: Facility location · Computational geometry ·
Combinatorial optimization · Voronoi game · Convex polygon ·
Orthogonal polygon · Rectilinear polygon · Orthogonal convex
polyhedron


1 Introduction 


The competitive facility location problem is a fundamental geometric optimization
problem. The Voronoi games, introduced by Ahn et al. [1], are a subclass of
competitive facility location problems that deal with two or more players taking
turns in placing their facilities while optimizing their payoffs. A Voronoi game G
consists of a playing field M in which there is a client area C ⊆ M that demands
a service provided by two competing players (named Alice and Bob). In each
round of the game, Alice places a set of facilities in M, followed by
Bob, to serve the clients. A client receives service from its nearest facility according
to the distance metric considered in the model.

In this paper, we consider a geodesic Voronoi game in the L_1-space for a
polygon P and a fixed set of points C as clients in the plane. The polygonal
region of P is supposedly owned by Alice. The clients in C are represented by
points in the plane. Bob can only serve the clients in the exterior of the polygon
P using exterior geodesic paths. Since Bob can only serve the clients exterior
to the restricted region P, and hence compete only for these exterior clients, we
only consider the clients that are in the exterior of the polygon P. Therefore,
the metric for Bob is the external geodesic L_1 distance. Here the interior and
exterior of the polygons and polytopes are closed unless specifically mentioned.

The focus of this paper is on the single-round Voronoi game, where both
players place only one facility each. Alice places her facility A first in the
interior of an orthogonal convex polygon P. A set S ⊂ R^d is orthogonal convex
if its intersection with any orthogonal line, i.e., axis-parallel line, is convex
(Unger [11]). Then Bob places his facility in the exterior of P. Alice and Bob
serve the clients using unrestricted paths and external geodesic paths, respectively,
that are shortest in the L_1 metric. The payoffs of Alice and Bob, denoted
by S_A and S_B, respectively, also called their scores, are the total number of
clients in their respective Voronoi regions. In case of a tie between the two nearest
facilities, the client is equally served by both the players and counted as half
in both the payoffs. The two optimization problems in the Voronoi game are to
maximize the payoffs of Alice and Bob.
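
As an illustration of the orthogonal convexity definition above, the following Python sketch checks a discrete analogue: a finite set of unit grid cells, identified by integer coordinates, is treated as orthogonal convex if its cells in every row and in every column form one contiguous run. The grid setting and the function names are assumptions made here for illustration; the paper itself works with polygons in the plane.

```python
from collections import defaultdict


def is_contiguous(values):
    """True if the integers in values form one run without gaps."""
    vals = sorted(values)
    return vals[-1] - vals[0] + 1 == len(vals)


def is_orthogonally_convex(cells):
    """Check that every row and every column of the cell set is one run."""
    rows, cols = defaultdict(set), defaultdict(set)
    for x, y in cells:
        rows[y].add(x)  # x-coordinates occupied in row y
        cols[x].add(y)  # y-coordinates occupied in column x
    return (all(is_contiguous(r) for r in rows.values())
            and all(is_contiguous(c) for c in cols.values()))


# An L-shape passes the test, a U-shape does not (its top row has a gap).
l_shape = {(0, 0), (1, 0), (0, 1)}
u_shape = {(0, 0), (1, 0), (2, 0), (0, 1), (2, 1)}
```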

We use several interesting combinatorial and geometric techniques to prove
the upper and lower bounds for the payoffs of Alice and Bob when the regions
possessed by Alice are orthogonal convex polygons (see Fig. 1) and orthogonal
convex polyhedra in the L_1-space.


1.1 Previous Results 


Ahn et al. [1] described and solved the Voronoi game in line segments and 
circles in which Alice and Bob place an equal number of facilities, in one or 
more rounds, on line segments or circles. The payoff is the total length of their 
respective Voronoi cells. Cheong et al. [8] solved the single-round Voronoi game 
in a square in which Alice and Bob place multiple facilities at once, Alice before 
Bob, and the payoff is the total area of their respective Voronoi cells. Later, 
Fekete and Meijer [9] improved on the finer details of their solution and showed 
the intractability of the Voronoi games in simple polygons with holes and the 
payoff is the total polygonal area of their respective Voronoi cells. 

Banik et al. [3] studied the discrete version of the Voronoi game on line 
segments with a finite number of point clients, and the payoff is the total number 
of clients in Alice and Bob’s individual Voronoi cells. They proved bounds on 
the payoffs and designed optimal strategies for Alice and Bob. Banik et al. [2], 
and later, de Berg et al. [7], proposed algorithms for the discrete version of the
single-round Voronoi game in R^1 in which Alice and Bob place multiple facilities
in a single round and the payoff is the number of clients. Banik et al. [4] studied
the discrete version of the single-round Voronoi game in R^2 in which Alice and
Bob place new facilities among their previously owned facilities, and the payoff
is the total number of clients in their respective Voronoi cells. They present
polynomial-time algorithms for an optimal placement for Alice and Bob for each
of the L_1, L_2 and L_∞ metrics. Later, Banik et al. [6] studied the discrete version
of the single-round Voronoi game in a simple polygon in which the game arena 
is the internal geodesic space of a simple polygon, and the payoff is the total 
number of clients in their individual Voronoi cells. They devised polynomial-time 
algorithms for an optimal placement of both Alice and Bob for this game. 

Recently, Banik et al. [5] introduced the Voronoi game on a polygon in
which Alice and Bob play the Voronoi game on the boundary of simple or convex
polygons and polyhedra, the metrics are the internal and external geodesic L_2 metrics,
and the payoff is the number of clients in their respective Voronoi regions. They
proved tight upper and lower bounds for the payoffs of both players for the single-round
or k-round game on simple or convex polygons or polyhedra. They also devised
algorithms for the optimal placements for Alice and Bob for the single-round
Voronoi game on a convex polygon.


1.2 New Results 


In this paper, we study the same Voronoi game problems as in [5], termed the
geodesic Voronoi games, but for the L_1 metric in place of the Euclidean metric
L_2. The convexity is similarly replaced by orthogonal convexity. We take
up questions regarding the Voronoi game, such as if Alice and Bob are guaran- 
teed some payoff, and if not, whether there exist some preconditions that may 
guarantee a payoff, what are these preconditions, how much is the guaranteed 
payoff, etc. We assume both the players play with their optimal strategies in 
this model. We show that if the clients are only on the boundary of the given 
convex orthogonal polygon, then Alice is always guaranteed to get at least half 
of the clients in every case, whereas Bob can only ensure one-sixth of the total 
number of clients in the worst case (Sect.3). We also prove that this bound for 
Alice is [4] and for Bob is [34], when we extend our model in 3-dimensional 
space (Sect.4). We also show examples to ensure the tightness of the bounds. 
In addition, we outline several interesting properties of the Voronoi regions of 
Alice and Bob in this metric space (Sect. 2). 


2 Preliminaries 


Bob’s maximization of his payoff in any single-round Voronoi game depends on 
where Alice places her facility first. Hence Bob’s strategy is easier to understand. 
Bob’s naive strategy will be to search among all possible feasible locations, say 
F, and choose that location that maximizes his payoff. Alice’s maximization 
of her payoff is comparatively more complex. Since Alice cannot predetermine 
Bob’s choice afterwards, she has to preempt Bob’s choice and needs to choose 
a location preparing for the worst; thus, she puts her facility among all possible 
locations in F where her minimum payoff is maximized. It is easy to see that
the game, as described above, is a constant-sum game, and hence the sum of the
payoffs of Alice and Bob is the total number of clients.


Observation 1. Let S_A^* and S_B^* denote the final scores of Alice and Bob, respectively,
in a single-round geodesic Voronoi game when they play optimally. Then,
S_A^* + S_B^* = n.


2.1 Orthogonal Convex Polygons for the L_1 Metric


Let A and B denote the optimal placement of the facilities by Alice and Bob,
respectively, in the single-round geodesic Voronoi game G. Let P denote the
orthogonal convex polygon in the game. Let d_A and d_B be the distance metrics
for Alice and Bob defined as before. First, we mention a property of geodesic
paths on orthogonal polygons with the L_1 metric, see Fig. 1.
Observation 2. The closed boundary of any orthogonal convex polygon can be
non-uniquely partitioned into at least two and at most four monotonic L_1-paths
(staircases). The anti-clockwise boundary from the maximum x-coordinate to
the maximum y-coordinate, on the top right of P, is called the (++)-quadrant
∂P boundary. Likewise, we define the (-+)-quadrant ∂P, (+-)-quadrant ∂P and
(--)-quadrant ∂P boundaries. This partitioning can be proved by induction.


Fig. 1. An orthogonal convex polygon P with the boundary divided into four
xy-monotonic parts.


Fig. 2. Bob can move along the path BB* to the boundary ∂P to increase
his payoff.


Next, for geodesic Voronoi games, we make an important observation about 
boundary and non-boundary geodesic paths similar to [10]. 


Lemma 1. Let s and t be any pair of points on the boundary ∂P of any orthogonal
convex polygon P. There exists an external geodesic L_1-path between s and
t that is a part of the boundary ∂P.


Proof. The external geodesic path can be deformed smoothly to a boundary
path of the same length due to the orthogonal convexity of P and Observation 2.


Corollary 1. If an external geodesic path in the L_1 metric, for an orthogonal
convex polygon, does not touch the boundary ∂P of the polygon, then the path
length is the L_1 distance between the end-points.


Thus there exist external geodesic paths that are either completely free from the
boundary ∂P of the orthogonal convex polygon P or contain only one connected
part of the boundary ∂P. Next, we characterize the optimal locations of Alice
and Bob that maximize their payoffs. We note that whenever Alice places her
facility in the region she owns, Bob has to place the facility as near to the region
as possible. By moving nearer, Bob can gain more clients.


Lemma 2. If Alice’s facility A is in the (closed) interior of the orthogonal convex
polygon P in the game G, then there exists a point B* on the boundary ∂P
of P that maximizes Bob’s payoff.
Proof. Let B be any point in the exterior of the polygon P. Without loss of
generality, we assume that B is on the top-right of P. We consider the geodesic
bisector of A and B using their respective distance metrics. Because of the staircase
nature of the first quadrant boundary, we can show that B can be moved
successively vertically below or horizontally left, and then in the bottom-left direction
at an angle of π/4 towards P such that the Voronoi region of B enlarges or
remains the same. For example, in Fig. 2 Bob first moves vertically down and
then obliquely towards the bottom-left.


There are certain conditions when Alice has to place her facility in the orthogonal 
convex polygon P. This happens when the clients are on the boundary of P. 
The orthogonal convex hull of a set of points S is the smallest orthogonal convex 
polygon that contains S. We state a more general condition below. 


Lemma 3. There exists a point A*, that maximizes Alice’s payoff, in the 
(closed) interior of the orthogonal convex hull of C in the game G. 


Proof. For any point A in the exterior of an orthogonal convex hull there exists 
a point A’ on the hull such that the distance from any point in the hull to A 
is not less than to A’ (A’ is the point nearest to A in one of the eight cardinal 
directions). Thus A’ increases the payoff of Alice compared to A. An interior A* 
improves on A’. 


Since the orthogonal convex hull of a set of points on the boundary of the 
orthogonal convex polygon P is contained in P, we have the following lemma. 


Lemma 4. If all the clients C are on the boundary of the orthogonal convex 
polygon P in the game G, then there exists an optimal facility location of Alice, 
that maximizes her payoff, in the (closed) interior of P. 


The playing field M is partitioned by Alice and Bob by the clients that they
serve, respectively grouped in their Voronoi regions. We state some conditions
when these regions are connected or disconnected in the following lemmas. We
denote the Voronoi regions of Alice’s facility A and Bob’s facility B as vor(A)
and vor(B), respectively. We call them Alice’s and Bob’s Voronoi regions, respectively.
See Figs. 3 and 4.


Lemma 5. Let d_A and d_B be any two metrics over M. Let Alice and Bob follow
the d_A and d_B metrics, respectively, for their payoffs. If d_A(x, y) ≤ d_B(x, y) for every
x, y ∈ M, then vor(B) is connected in the Voronoi diagram of the set {A, B}.


Proof. Assume vor(B) is not connected. B is in vor(B) as d_B(B, B) = 0 and
d_A(A, B) > 0. We consider a point p ∈ vor(B) not in the connected region to
which B belongs. Since p ∈ vor(B), d_B(B, p) ≤ d_A(A, p). Next, we consider a
geodesic path π from B to p which connects the two disconnected regions of
vor(B) (and which may not be unique). We can select a point q on the path π
not in vor(B) due to the disconnectedness of vor(B). Then, d_A(A, q) < d_B(B, q).
From the premise, d_A(p, q) ≤ d_B(p, q). Using the geodesic path π, these inequalities
and the triangle inequality, we get d_B(B, p) = d_B(B, q) + d_B(p, q) >
d_A(A, q) + d_A(p, q) ≥ d_A(A, p), which leads to a contradiction.
Fig. 3. Bob’s Voronoi region is connected in geodesic Voronoi game G in
L_1. Bob’s Voronoi region on the boundary ∂P is also connected.


Fig. 4. Alice’s Voronoi region is disconnected in (i) and Bob’s Voronoi region
on the boundary of P is disconnected in (ii) in G.


The premise of Lemma 5 holds true for any geodesic Voronoi game, therefore 
Bob’s Voronoi region is connected in G. Alice’s Voronoi region is also connected 
conditionally. Bob’s Voronoi region on the boundary OP is also connected when 
Bob places his facility on the boundary. We state these facts below. 


Lemma 6. Bob’s Voronoi region is connected in the geodesic Voronoi game G. 


Proof. d_A(x, y) ≤ d_B(x, y) for all x, y ∈ R^2, so the premise of Lemma 5 holds.


Lemma 7. If Alice’s facility A is in the interior of P in the geodesic Voronoi 
game G then Alice’s Voronoi region is connected and may be disconnected oth- 
erwise. 


Proof. The proof is similar to the proof of Lemma 5 above if we exchange A and
B. However, the substituted premise d_B(x, y) ≤ d_A(x, y) does not hold for every
x, y ∈ M. We give the proof sketch below. We assume that A is in the interior
of P and vor(A) is disconnected. Let p ∈ vor(A) be disconnected from A. We
consider the geodesic path from A to p and the point q to be the last exit point of
that path from vor(B). d_A(p, q) = d_B(p, q) follows from Corollary 1 as both are
L_1 distances. Hence d_A(A, p) = d_A(A, q) + d_A(p, q) = d_B(B, q) + d_B(p, q) ≥ d_B(B, p),
which contradicts p ∈ vor(A). See Fig. 4 for an idea on how vor(A) might be
disconnected when Alice’s facility A isn’t in the interior of P.


Lemma 8. Let P be the orthogonal polygon in a geodesic Voronoi game G for
the L_1 metric. If Bob’s facility B is on the boundary of P, then the Voronoi
region of Bob on the boundary, P ∩ vor(B), is connected and may be disconnected
otherwise.


Proof. If B is on the boundary, then for any point p ∈ P ∩ vor(B), the geodesic
path from B to p will be on the boundary and in P ∩ vor(B). Therefore P ∩
vor(B) will be connected. See Fig. 4 for an idea on how vor(B) on ∂P might be
disconnected when Bob’s facility B isn’t on the boundary.


We note that if Bob’s Voronoi region on the boundary of the L_1-convex polygon
P is connected, Alice’s Voronoi region is also connected.
2.2 Orthogonal Convex Polyhedra for the L_1 Metric


First, we make some observations on the structure of the axis-parallel orthogonal 
convex polyhedra. 


Observation 3. Similar to the case of the plane, the surface ∂P of an axis-parallel
orthogonal convex polyhedron can be divided into eight (staircase-like) parts in
R^3. We call them the (+++)-octant ∂P, (-++)-octant ∂P, ..., (---)-octant ∂P
surfaces.


We consider a geodesic Voronoi game G for an axis-parallel orthogonal convex
polyhedron P and clients C. In R^{d≥3}, the condition that Alice’s optimal facility
location has to be inside the orthogonal convex hull of C does not hold. We
generalize Lemma 3 as follows for R^d.


Observation 4. Any point A* that maximizes Alice’s payoff in a geodesic Voronoi
game G for an orthogonal convex polytope in R^{d≥3} for L_1 is in the (closed)
interior of the smallest hypercube that contains C.


We use the properties of the L_1 metric to prove Observation 4. Contrary to our
expectations, even if Alice is in the interior of P, the optimal placement
of Bob that maximizes his payoff need not be on ∂P, and the Voronoi region of
Alice on ∂P need not be connected. See Fig. 5, where both Alice’s and Bob’s
optimal facility locations are in the exterior.


Fig. 5. Both Bob’s and Alice’s optimal facility location is at the corner
extended on the exterior in the geodesic Voronoi game for L_1 in R^3.


Fig. 6. Alice’s Voronoi region on the surface ∂P is disconnected in the
geodesic Voronoi game for L_1. Alice’s facility A may be slightly below or
above the surface to be in the strict interior or strict exterior, respectively.


Lemma 9. There exists a geodesic Voronoi game in R^{d≥3} for an orthogonal
convex polytope P, such that there exists no optimal placement for Alice or Bob
on the surface ∂P, even if the clients C are on the surface ∂P.
Next, we consider the connectivity of the Voronoi regions of Alice and Bob.
We have already shown that Bob’s Voronoi region, if Bob is on the surface
∂P, is connected on the surface. However, since Bob’s optimal placement may
be on the exterior of the polytope, his Voronoi region on the surface might be
disconnected. Moreover, even if Bob is on the surface and Alice anywhere, Alice’s
Voronoi region on the surface may be disconnected (see Fig. 6).


Lemma 10. Alice’s Voronoi region may be disconnected on the surface ∂P of
the orthogonal convex polyhedron P in a geodesic Voronoi game in R^{d≥3}, whether
A is in the interior, exterior or on the boundary of P. This also holds for
Alice’s optimal facility location A*.


3 Bounds for Orthogonal Convex Polygons 


Let Alice and Bob play a single-round Voronoi game in the L_1 metric with a
set of n clients in the plane. First, we prove some bounds for the unrestricted
case in which Alice, like Bob, does not own any region. Subsequently, we prove
the bounds for the geodesic Voronoi games for orthogonal convex polygons with
the clients on the boundary. We note that the unrestricted case is sometimes
referred to as the discrete Voronoi game in the literature.

Observe that in the unrestricted single-round Voronoi game in the L_1 metric,
Bob is guaranteed to serve at least n/2 clients by placing his facility at exactly
the same location as Alice’s. On the other hand, as shown in the lemma
below, Alice is also guaranteed to serve n/2 clients if she places her facility at one
of her optimal locations, which are the intersections of vertical and horizontal
lines that contain at most ⌊n/2⌋ clients in each of their open halves.


Lemma 11. In a single-round Voronoi game in the L_1 metric in the plane with
n clients, there exists a placement A* that ensures a payoff of n/2 for Alice.


Proof. Let us consider the horizontal and vertical lines that divide the clients
in half, i.e., the lines are such that each open half contains at most $\lfloor n/2 \rfloor$ clients.
Let us assume, for the sake of simplicity, that there is at most one client on the
two lines. Alice places her facility on the intersection point. Let the numbers of
clients in the four quadrants be $n_{++}$, $n_{+-}$, $n_{-+}$ and $n_{--}$. Then
$n_{++}+n_{+-} = n_{+-}+n_{--} = n_{--}+n_{-+} = n_{-+}+n_{++} = \lfloor n/2 \rfloor$.
The Voronoi region of Alice, wherever Bob places
his facility, will either contain two neighboring quadrants, contain one quadrant
and share its two neighboring quadrants, or share all four quadrants. In every
case, it will ensure a payoff of $\frac{n}{2}$, including the share of the client at the facility
location itself, if any. This follows from the equalities above. The case when there
are clients on the lines can be analyzed similarly.
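
The placement used in this proof can be checked numerically. The following is a minimal sketch and not part of the original argument: it assumes clients are given as integer coordinate pairs, that a client equidistant from both facilities contributes one half to each player, and that Bob is restricted to a finite grid of trial positions. The names `median_point` and `alice_payoff` are hypothetical.

```python
from fractions import Fraction


def l1(p, q):
    """L1 (Manhattan) distance between two points in the plane."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def median_point(clients):
    """Intersection of a vertical and a horizontal median line.

    Each open half-plane bounded by either line then contains
    at most floor(n/2) clients.
    """
    xs = sorted(c[0] for c in clients)
    ys = sorted(c[1] for c in clients)
    mid = len(clients) // 2
    return (xs[mid], ys[mid])


def alice_payoff(clients, alice, bob):
    """Clients strictly closer to Alice, plus half of the tied clients."""
    total = Fraction(0)
    for c in clients:
        d_a, d_b = l1(c, alice), l1(c, bob)
        if d_a < d_b:
            total += 1
        elif d_a == d_b:
            total += Fraction(1, 2)
    return total
```

For a small client set one can loop over candidate positions of Bob and confirm that `alice_payoff` never drops below `Fraction(n, 2)` when Alice sits at `median_point(clients)`.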


Thus, in every Voronoi game in the plane for the $L_1$ metric without restrictions,
the optimal payoffs of Alice and Bob are always equal.


Theorem 1. The optimal payoffs of Alice and Bob in any single-round Voronoi
game in the plane for the $L_1$ metric without restrictions for a set of $n$ clients
are both $\frac{n}{2}$.




3.1 Orthogonal Convex Polygon with Clients on Boundary 


Let $G$ be a geodesic Voronoi game such that the clients in $C$ are located on the
boundary $\partial P$ of the orthogonal convex polygon $P$ that Alice owns. The metric is
$L_1$, as before. Alice and Bob can place their facilities anywhere in the plane. However,
we have already shown in Lemma 4 that there always exists an optimal placement
of Alice inside $P$, so we only consider Alice’s placements inside $P$. On the other
hand, Bob can get any client only if he places his facility outside $P$, so we
only consider Bob’s placements outside $P$. We note that Bob’s distances
to clients are not less than the distances in the equivalent unrestricted game
without $P$. So, following Lemma 11, Alice naturally has a guaranteed payoff of
$\frac{n}{2}$. Surprisingly, however, we can show that there exist games such that Alice
gets no more than $\frac{n}{2}$ clients. We prove this tight bound below.


Lemma 12. Alice’s optimal payoff is at least $\frac{n}{2}$ in any single-round geodesic
Voronoi game for an orthogonal convex polygon $P$, clients $C$ on $\partial P$ and the $L_1$
metric.


Proof. The proof follows from Lemma 11 and the fact that $d_A$ remains the same
whereas $d_B$ can only increase in the argument there.


We next show that this bound is tight. 


Lemma 13. There exists a single-round geodesic Voronoi game for an orthogonal
convex polygon $P$ and $n$ clients on $\partial P$ in the $L_1$ metric such that Alice’s optimal
payoff is $\frac{n}{2}$.
Proof. We construct a game where Alice owns a square region. We place $\lfloor n/2 \rfloor$
clients on one vertex, the other $\lfloor n/2 \rfloor$ clients on the diagonally opposite vertex
and the one remaining client, if it exists, on one of the other two vertices.
Wherever Alice places her facility, Bob can get at least $\frac{n}{2}$ clients by placing a
facility on this last vertex (even if $n$ is even and there is no remaining client).


Next, we show a non-trivial bound for Bob. See Figs. 7 and 8. 


Fig. 7. Why is Bob’s payoff at least $\lceil n/6 \rceil$? (The boundary $\partial P$ and the clients $c \in C$ on it are moved.)

Fig. 8. Bob’s payoff is at most $\lceil n/6 \rceil$.

Lemma 14. Bob’s optimal payoff is at least $\lceil n/6 \rceil$ in any single-round geodesic
Voronoi game for any orthogonal convex polygon $P$ and $n$ clients on $\partial P$ in the
$L_1$ metric.


Proof. As a result of Lemma 4, we can assume that Alice’s location $A^*$ is in
the interior of $P$. We note that we can move the boundary of $P$, together with
the clients and $B^*$, in each of the four quadrant boundaries mentioned previously
towards $A^*$ at an angle $\pi/4$ with the axes, such that the distance from every
client to Bob remains the same, whereas the distance to Alice may decrease. We do
not move the four $x$ and $y$ extremes. We can show this using the properties of the
$L_1$ metric and the geodesics. We keep moving as long as the polygon remains simple,
i.e., the boundaries do not cross the other boundaries. If we show that Bob gets
$\lceil n/6 \rceil$ clients for this new polygon, then he gets $\lceil n/6 \rceil$ clients for the
polygon $P$. Let us rename the new polygon as $P$ for this proof. See Fig. 7 for an
illustration.

We observe that out of the four quadrant boundaries of $P$, only two, and only
the opposite ones, can intersect the horizontal and vertical lines passing through
$A^*$. This is because we have two axes and four intersections around $A^*$ by
the orthogonal convex polygon boundary (we argue using a combined reasoning
of the pigeonhole principle and the partition function $p(4) = 5$). We consider Bob’s
placement in each quadrant boundary. If a quadrant boundary, $\partial Q$, intersects
both the axes, we can show that Bob’s best possible location for $\partial Q$ gets at
least half of the clients in $\partial Q$. Otherwise, if a quadrant boundary, $\partial Q$, does not
intersect both the axes, Bob’s best possible location for $\partial Q$ can get all the clients
in $\partial Q$, since $\partial Q$ will be fully inside Bob’s Voronoi region, i.e., $\partial Q \subseteq vor(B)$.
Thus, Bob has four candidate locations to consider for optimality, at most two
candidates getting at least half, and at least two candidates getting all of their
respective quadrant boundary clients. This is because of the pigeonhole principle,
as each axis can intersect only two quadrant boundaries due to orthogonality.

The problem of maximizing Bob’s payoff can be written as a min-max
optimization problem: say, without loss of generality and in the worst case,
$S_B \ge \min_{n_{++}, n_{+-}, n_{--}, n_{-+}} \max\{n_{++}, n_{+-}/2, n_{--}, n_{-+}/2\}$,
where $n_{++}+n_{+-}+n_{--}+n_{-+} = n$
and each $n_{++}, n_{+-}, n_{--}, n_{-+} \ge 0$. The numbers $n_{++}, n_{+-}, n_{--}, n_{-+}$ are the numbers
of clients in the respective quadrant boundaries (we assume for simplicity that
the boundaries do not share clients). We solve this optimization problem to get
$S_B = \frac{n}{6}$; the minimum is attained at $n_{++} = n_{--} = \frac{n}{6}$ and $n_{+-} = n_{-+} = \frac{n}{3}$.
Since Alice can be made not to touch any two quadrant boundaries at
the same time, by choosing quadrant boundaries carefully, the best candidate
of Bob, $B^*$, is in the shared boundary that does not have $A^*$ and contains
an odd number of clients. We can show, by counting, that Bob will have at least
one exclusive client. In such a case we can show that $S_B \ge \lceil n/6 \rceil$ in the above
optimization problem by a careful analysis.
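
The value of the min-max problem above can be confirmed by exhaustive search for small $n$. This is only an illustrative sketch under the stated simplification (four quadrant-boundary counts that are non-negative integers summing to $n$); the function name `worst_case_bob_bound` is hypothetical.

```python
from fractions import Fraction


def worst_case_bob_bound(n):
    """Minimum over (a, b, c, d) >= 0 with a + b + c + d = n
    of max(a, b/2, c, d/2), computed exactly by brute force.

    a, c: quadrant boundaries whose clients Bob can take entirely;
    b, d: quadrant boundaries where Bob takes at least half.
    """
    best = None
    for a in range(n + 1):
        for b in range(n + 1 - a):
            for c in range(n + 1 - a - b):
                d = n - a - b - c
                value = max(Fraction(a), Fraction(b, 2),
                            Fraction(c), Fraction(d, 2))
                if best is None or value < best:
                    best = value
    return best
```

For $n$ divisible by 6 the search returns exactly $n/6$; for other $n$ the integrality of the counts pushes the value slightly above $n/6$.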


In Fig. 8 we present an example to show that this bound is tight. 


Lemma 15. There exists a single-round geodesic Voronoi game for an orthogonal
convex polygon $P$ and $n$ clients on $\partial P$ in the $L_1$ metric such that Bob’s
optimal payoff is $\lceil n/6 \rceil$.


We summarize the bounds in the theorem below. 
Theorem 2. Alice’s optimal payoff is at least $\frac{n}{2}$ and Bob’s optimal payoff is at
least $\lceil n/6 \rceil$ in any single-round geodesic Voronoi game $G$ in the $L_1$ metric for an
orthogonal convex polygon $P$ and $n$ clients on the boundary $\partial P$ of the polygon.
Both the bounds are tight.


As matters stand, we can also show that similar tight bounds hold if the
clients are unrestricted in the plane.


4 Bounds for Orthogonal Convex Polyhedra 


This section extends the results on the bounds on orthogonal polygons to 3-space.
We observe that the geodesic Voronoi games for orthogonal convex polyhedra
differ significantly from the games for orthogonal polygons in the $L_1$ metric.
Let $G$ be a geodesic Voronoi game and let $P$ and $C$ be the orthogonal convex
polyhedron and the set of $n$ clients, respectively.

In the unrestricted case, similar to the case of the plane, Alice is guaranteed
a fraction of clients, but, unlike the case of the plane, Alice wins only
a small fraction of clients. We compute the three orthogonal planes parallel to the
$xy$-plane, $yz$-plane and $xz$-plane, respectively, that contain at most $\lfloor n/2 \rfloor$ clients
in their open halves. Alice is guaranteed a payoff of $\lceil n/4 \rceil$ if she puts her facility
at the intersection point of these three planes.


Theorem 3. Alice’s optimal payoff is at least $\lceil n/4 \rceil$ and Bob’s optimal payoff is
at least $\frac{n}{2}$ in any single-round Voronoi game $G$ in the $L_1$ metric for $n$ clients.
Both the bounds are tight.


Proof. We assume, for the sake of simplicity, that the orthogonal planes contain
at most one point. We can show that the numbers of clients in the eight octants
satisfy the following relations for some $m$: $n_{+++} = n_{---} + m$,
$n_{+--} = n_{-++} + m$, $n_{-+-} = n_{+-+} + m$, and $n_{--+} = n_{++-} + m$. The best placement for Bob will
be infinitesimally near Alice’s placement. Then $vor(A)$ either shares the whole space,
shares four octants and contains two, contains four octants on one side of one of
the three orthogonal planes mentioned in the discussion, or contains four octants,
three of which are neighbors of the remaining one. In each case, we can prove that
Alice wins at least $\lceil n/4 \rceil$ clients.
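
The relations between opposite octants follow from the three plane-balance conditions alone. The sketch below is an illustration under the simplifying assumption that no client lies on any of the three planes; it checks by exhaustive enumeration over small octant counts that every balanced configuration has the same difference $m$ for the four pairs of opposite octants. The names `is_balanced` and `opposite_differences` are hypothetical.

```python
from itertools import product

# The eight octants, encoded as sign triples (sx, sy, sz).
SIGNS = list(product((1, -1), repeat=3))


def is_balanced(counts):
    """True if each of the three coordinate planes splits the clients evenly."""
    for axis in range(3):
        plus = sum(c for s, c in counts.items() if s[axis] == 1)
        minus = sum(c for s, c in counts.items() if s[axis] == -1)
        if plus != minus:
            return False
    return True


def opposite_differences(counts):
    """Differences n_s - n_{-s} for the octants +++, +--, -+-, --+."""
    reps = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    return [counts[s] - counts[tuple(-x for x in s)] for s in reps]
```

In every balanced configuration the four returned differences coincide, which is the common value $m$ used in the proof.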

We can construct an example where the bounds are tight such that $m = \lfloor n/4 \rfloor$.
We distribute an equal number of clients at uniform intervals on four rays in four
octants, $x = y = z > 0$, $x = -y = -z > 0$, $-x = y = -z > 0$ and
$-x = -y = z > 0$, and put at least one client at the origin if $n \bmod 4 \neq 0$.
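
The tight example can be simulated directly. The following sketch assumes $n$ is a multiple of 4 (so no client is placed at the origin), puts clients at integer steps along the four rays, places Alice at the origin and Bob at the hypothetical position $(-\tfrac12,-\tfrac12,-\tfrac12)$ close to her; the helper names are illustrative only.

```python
from fractions import Fraction


def l1(p, q):
    """L1 distance between two points in 3-space."""
    return sum(abs(a - b) for a, b in zip(p, q))


def ray_clients(per_ray):
    """Clients at steps 1..per_ray on the rays x=y=z, x=-y=-z, -x=y=-z, -x=-y=z."""
    directions = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    return [tuple(t * c for c in d)
            for d in directions for t in range(1, per_ray + 1)]


def payoffs(clients, alice, bob):
    """Return (Alice's payoff, Bob's payoff); tied clients are split evenly."""
    a = b = Fraction(0)
    for c in clients:
        d_a, d_b = l1(c, alice), l1(c, bob)
        if d_a < d_b:
            a += 1
        elif d_b < d_a:
            b += 1
        else:
            a += Fraction(1, 2)
            b += Fraction(1, 2)
    return a, b
```

With `per_ray = 5` (so $n = 20$), Alice at the origin keeps only the ray $x = y = z > 0$, that is $n/4$ clients, while Bob takes the other three rays.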


4.1 Orthogonal Convex Polyhedra with Boundary Clients 


Consider the case of the geodesic Voronoi game when the clients are located on
the boundary of the convex polyhedron $P$. We again analyze the intersection point
of the orthogonal planes mentioned previously for Alice’s optimal location. From
those arguments, we have that Alice’s payoff is at least $\frac{n}{4}$.
The lower bound is not tight. We construct a geodesic Voronoi game in which
Alice’s payoff is not more than [ 21, implying that the lower bound is not more 
than [432]. We state this in the following lemma. 


Lemma 16. There exists a single-round geodesic Voronoi game in the $L_1$ metric
for a convex polyhedron $P$ with $n$ clients on the boundary $\partial P$ such that Alice’s
optimal payoff is [YZ]. 


Proof. We prove this by distributing points (i.e., clients) on the centers of the faces
of a cube and on four of its corners that form a tetrahedron. We place $5n/26$ clients
on each of the four tetrahedral corners and $n/26$ clients on each of the six centers
of faces. See Fig. 9 for an illustration.


Fig. 9. Alice’s payoff is at most the bound of Lemma 16 in the geodesic Voronoi game for $L_1$ ($A^*$ is at the center).

Fig. 10. Bob’s payoff is at most $\lceil n/24 \rceil$ in the geodesic Voronoi game for $L_1$ ($A^*$ is at the center; $B^*$ is between the two nearest groups of $n/48$ clients).


Next we show that Bob’s payoff is at least $\lceil n/24 \rceil$ in the lemma below.


Lemma 17. Bob’s optimal payoff is at least $\lceil n/24 \rceil$ in any single-round geodesic
Voronoi game for an orthogonal convex polyhedron $P$ and $n$ clients on the boundary
$\partial P$ in the $L_1$ metric.


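Proof. Let Alice place her facility at $A$. We consider the eight octant surfaces
of the boundary $\partial P$ of the orthogonal convex polyhedron $P$, as mentioned in Observation 3.
We can prove that three locations of Bob’s facility on each octant surface are
sufficient so that the octant surface is included in the union of the corresponding
regions $vor(B)$, for any $A$. Thus a total of 24 locations of $B$ are sufficient to
cover the whole of $\partial P$ by the regions $vor(B)$. Bob places his facility at the location for
which $vor(B)$ contains the maximum number of clients; by the pigeonhole principle,
this region contains at least $\lceil n/24 \rceil$ clients.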
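
The counting step at the end of this proof is a plain pigeonhole argument. As a minimal sketch, assume the 24 candidate regions are given as sets of client identifiers whose union is the whole client set (how the regions are computed geometrically is not modelled here); the name `best_region` is hypothetical.

```python
from math import ceil


def best_region(regions, n):
    """Return a largest region among candidates that together cover all n clients."""
    covered = set().union(*regions)
    if len(covered) != n:
        raise ValueError("the candidate regions must cover every client")
    return max(regions, key=len)
```

Whenever the 24 regions cover all $n$ clients, the region returned has at least `ceil(n / 24)` clients.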


We also show that this lower bound of $\lceil n/24 \rceil$ is tight by constructing a geodesic
Voronoi game where Bob does not win more than $\lceil n/24 \rceil$ clients.


Lemma 18. There exists a single-round geodesic Voronoi game in the $L_1$ metric
for a convex polyhedron $P$ with clients on the boundary $\partial P$ such that Bob’s
optimal payoff is $\lceil n/24 \rceil$.
Proof. We construct a cube with double hollowed corners along with a smaller
cube in the center and distribute $n/48$ clients each at the six corners nearer to
the center, as in Fig. 10.


We summarize our results on Voronoi games on orthogonal convex polyhedra 
with boundary clients in the theorem below. 


Theorem 4. Alice’s optimal payoff is at least $\lceil n/4 \rceil$ and Bob’s optimal payoff
is at least $\lceil n/24 \rceil$ in any single-round Voronoi game $G$ in the $L_1$ metric for an
orthogonal convex polyhedron $P$ and $n > 2$ clients on the boundary $\partial P$ of $P$. The
lower bound of Bob’s optimal payoff is tight. Moreover, the tight lower bound of


Alice’s optimal payoff is at most [21 
Abstract. For a given integer $k \ge 2$, partitioning a connected graph
into $k$ vertex-disjoint connected subgraphs of similar (or fixed) orders is
a classical problem that has been intensively investigated since the late
seventies. A connected $k$-partition of a graph is a partition of its vertex set
into classes such that each one induces a connected subgraph. Given a
connected graph $G = (V, E)$ and a weight function $w : V \to \mathbb{Q}_{\ge}$, the
balanced connected $k$-partition problem looks for a connected $k$-partition
of $G$ into classes of roughly the same weight. To model this concept
of balance, we seek connected $k$-partitions that either maximize the
weight of a lightest class (MAX-MIN BCP$_k$) or minimize the weight of
a heaviest class (MIN-MAX BCP$_k$). These problems, known to be NP-hard,
are equivalent only when $k = 2$. We present a simple pseudo-polynomial
$\frac{k}{2}$-approximation algorithm for MIN-MAX BCP$_k$ that runs in
time $O(W|V||E|)$, where $W = \sum_{v \in V} w(v)$; then, using a scaling
technique, we obtain a (polynomial) $(\frac{k}{2} + \varepsilon)$-approximation with running
time $O(|V|^3|E|/\varepsilon)$, for any fixed $\varepsilon > 0$. Additionally, we propose a
fixed-parameter tractable algorithm for the unweighted MAX-MIN BCP (where
$k$ is part of the input) parameterized by the size of a vertex cover.


Keywords: Balanced connected partition · Approximation algorithm ·
Parameterized algorithm


1 Introduction 


The problem of partitioning a connected graph into a given number $k \ge 2$ of
connected subgraphs with prescribed orders was first studied by Lovász [16] and
Győri [15] in the late seventies. Let $[k]$ denote the set $\{1,2,\ldots,k\}$, for every
integer $k \ge 1$. A connected $k$-partition of a connected graph $G = (V,E)$ is a
partition of $V$ into nonempty classes $\{V_i\}_{i=1}^{k}$ such that, for each $i \in [k]$, the
subgraph $G[V_i]$ is connected, where $G[V_i]$ denotes the subgraph of $G$ induced by
the set of vertices $V_i$.

We denote by $(G, w)$ a pair consisting of a connected graph $G = (V, E)$ and a
function $w\colon V \to \mathbb{Q}_{\ge}$ that assigns non-negative weights to the vertices of $G$. For
each $V' \subseteq V$, we define $w(V') = \sum_{v \in V'} w(v)$. Furthermore, if $G' = (V', E')$ is a
subgraph of $G$, we write $w(G')$ instead of $w(V')$. If $\mathcal{P} = \{V_i\}_{i\in[k]}$ is a connected
$k$-partition of $G$, then $w^+(\mathcal{P})$ stands for $\max_{i\in[k]} \{w(V_i)\}$, and $w^-(\mathcal{P})$ stands for
$\min_{i\in[k]} \{w(V_i)\}$.

The concept of balance of the classes of a connected partition can be 
expressed in different ways. In this work, we consider two related variants whose 
objective functions express this concept. 


Problem. Min-Max Balanced Connected $k$-Partition (MIN-MAX BCP$_k$)
INSTANCE: a connected graph $G = (V, E)$, and a weight function $w\colon V \to \mathbb{Q}_{\ge}$.
FIND: a connected $k$-partition $\mathcal{P}$ of $G$.
GOAL: minimize $w^+(\mathcal{P})$.


Analogously, MAX-MIN BCP$_k$ has the same set of instances as MIN-MAX
BCP$_k$, but it seeks a connected $k$-partition $\mathcal{P}$ that maximizes $w^-(\mathcal{P})$.

The problems MIN-MAX BCP$_2$ and MAX-MIN BCP$_2$ are equivalent, that is,
an optimal solution for one of the versions is also an optimal solution for the
other version (but they may have different optimal values). However, for $k > 2$,
equivalence does not hold for MIN-MAX BCP$_k$ and MAX-MIN BCP$_k$.

For all problems mentioned here the input graph G is always simple and 
connected (and possibly with further properties). We also use the convention 
that n (resp. m) is the number of vertices (resp. edges) of the graph under 
consideration. 

Throughout this paper we assume that $k \ge 2$. When $k$ is in the name of
the problem, we are considering that $k$ is fixed. The problems in which $k$ is
part of the instance are denoted similarly but without specifying $k$ in the name
(e.g. MAX-MIN BCP). The unweighted (or cardinality) versions of the problems
refer to the case in which all vertices have equal weight, which may be
assumed to be 1. We denote the corresponding problems as 1-MIN-MAX BCP$_k$,
1-MAX-MIN BCP$_k$, 1-MIN-MAX BCP and 1-MAX-MIN BCP.

In this paper, we show approximation algorithms for MIN-MAX BCP$_k$, but
mention approximation results for both MIN-MAX BCP$_k$ and MAX-MIN BCP$_k$.
We observe that whenever we refer to an approximation algorithm, we mean
that it runs in polynomial time in the size of the instance. If an approximation
ratio $\alpha$ can be guaranteed for an algorithm, but it may run in pseudo-polynomial
time, we refer to it as a pseudo-polynomial $\alpha$-approximation. This is not the usual
terminology, but it will be appropriate for our purposes.

Problems of finding balanced connected partitions can model a rich collection
of applications in logistics, image processing, databases, operating systems,
cluster analysis and robotics [4,17,18,22].
Dyer and Frieze [12] proved that 1-MAX-MIN BCP$_k$ is NP-hard on bipartite
graphs. Furthermore, 1-MAX-MIN BCP$_k$ has been shown by Chlebíková [10] to
be NP-hard to approximate within an absolute error guarantee of $n^{1-\varepsilon}$, for all
$\varepsilon > 0$. For the weighted versions, Becker et al. [3] proved that MAX-MIN BCP$_2$
is NP-hard on grid graphs. Wu [21] showed that MAX-MIN BCP$_k$ is NP-hard on
interval graphs for every $k$. Chataigner et al. [7] proved that MAX-MIN BCP$_k$
is strongly NP-hard, even on $k$-connected graphs. Hence, unless P = NP, the
problem MAX-MIN BCP$_k$ does not admit a fully polynomial-time approximation
scheme (FPTAS). They also showed that, when $k$ is part of the instance, the
problem MIN-MAX BCP cannot be approximated within a ratio better than 6/5.

For MAX-MIN BCP$_k$ (resp. MIN-MAX BCP$_k$), Perl and Schach [20] (resp.
Becker, Schach, and Perl [5]) designed polynomial-time algorithms when the input
graph is a tree. Also for trees, Frederickson [14] proposed linear-time algorithms
for both MAX-MIN BCP$_k$ and MIN-MAX BCP. Polynomial-time algorithms were
also derived for MAX-MIN BCP on graphs with at most two cut-vertices [1,10].
For MAX-MIN BCP$_k$ on ladders, a polynomial-time algorithm was obtained by
Becker et al. [2].

Chlebíková [10] designed a (4/3)-approximation algorithm for MAX-MIN
BCP$_3$. In 2020, Chen et al. [8] observed that the algorithm obtained by Chlebíková
has approximation ratio 5/4 for MIN-MAX BCP (but requires another analysis).
These authors also obtained approximation algorithms with ratio 3/2 and 5/3 for
MIN-MAX BCP$_3$ and MAX-MIN BCP$_3$, respectively. In 2012, Wu [21] designed an
FPTAS for MAX-MIN BCP$_k$ restricted to interval graphs. When $k$ is part of the
input, very recently Casel et al. [6] derived a very involved 3-approximation
algorithm for both MAX-MIN BCP and MIN-MAX BCP, based on the crown
decomposition of the graph. For recent exact algorithms based on mixed integer linear
programs for these problems we refer the reader to Miyazawa et al. [19].


1.1 Our Contribution 


We show an approximation algorithm for MIN-MAX BCP$_k$, $k \ge 3$, that was
inspired by the $k/2$-approximation algorithm designed by Chen et al. [9] for
(the unweighted version) 1-MIN-MAX BCP$_k$. The algorithm we present here has
basically the same approximation ratio: namely, $k/2 + \varepsilon$, for any arbitrarily
small $\varepsilon > 0$. When the weights assigned to the vertices of the input graph are
bounded by a polynomial in the order of the graph, it achieves the ratio $k/2$.
The additional term $\varepsilon$ in the ratio comes from a scaling technique used
to deal with weights that might be very large. These results are presented in
Sect. 2. We note that a 3/2-approximation algorithm for MIN-MAX BCP$_3$ was
obtained by Chen et al. [8], but its analysis and implementation are slightly
more complicated than the algorithm we show here.

In Sect. 3, we prove that 1-MAX-MIN BCP is fixed-parameter tractable when
the parameter is the size of a vertex cover of the input graph. The proposed
algorithm is based on an integer linear program that has a doubly exponential
dependency on the size of a vertex cover. To the best of our knowledge, no
FPT algorithm for balanced connected partition problems is described in the
literature. We believe that the strategy used to model connected partitions may
be useful to show that other problems involving connectivity constraints are
fixed-parameter tractable when the parameter is the size of a vertex cover.


2 Approximation Algorithm for Min-Max BCP$_k$


Chen et al. [9] devised an algorithm for 1-MIN-MAX BCP$_k$ with approximation
ratio $k/2$. This algorithm iteratively applies two simple operations, namely PULL
and MERGE, to reduce the size of the largest class. In what follows, we show
how to generalize such operations for the weighted case to design a
$(\frac{k}{2} + \varepsilon)$-approximation for MIN-MAX BCP$_k$, for any $\varepsilon > 0$. First we discuss the algorithm
for the case $k = 3$, and then we show how to use the connected 3-partition
produced by this algorithm to obtain a connected $k$-partition for any $k \ge 4$.

Throughout this section, $(G,w)$ denotes an instance of MIN-MAX BCP$_k$, as
defined previously. Moreover, we assume without loss of generality that $w$ is an
integer-valued function (otherwise, we may simply multiply all weights by the
least common multiple of the denominators).
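
The reduction to integer weights can be sketched as follows. This is an illustrative helper, assuming Python 3.9+ (for `math.lcm`) and weights supplied as `fractions.Fraction` values or integers; the name `to_integer_weights` is hypothetical.

```python
from fractions import Fraction
from math import lcm


def to_integer_weights(w):
    """Multiply all rational weights by the LCM of their denominators.

    Returns the integer-valued weight function and the scaling factor.
    Scaling every weight by the same positive factor does not change
    which connected partitions are optimal.
    """
    fracs = {v: Fraction(x) for v, x in w.items()}
    scale = lcm(*(f.denominator for f in fracs.values()))
    return {v: int(f * scale) for v, f in fracs.items()}, scale
```

For example, the weights 1/2, 1/3 and 5/6 are scaled by 6 to the integers 3, 2 and 5.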

The following trivial fact is used to show the approximation ratio of the
algorithms for MIN-MAX BCP$_k$ proposed here. When convenient, we denote by
$\mathrm{OPT}_k(I)$ the value of an optimal solution for an instance $I$ of MIN-MAX BCP$_k$.


Fact 1. Any optimal solution for an instance $I = (G,w)$ of MIN-MAX BCP$_k$
has value at least $w(G)/k$, that is, $\mathrm{OPT}_k(I) \ge w(G)/k$.


For $k \ge 3$, let $\mathcal{G}_k$ be the class of connected graphs $G$ containing a cut-vertex $v$
such that $G - v$ has at least $k - 1$ components. We denote by $c(H)$ the number of
components of a graph $H$. The next lemma provides a lower bound for the value
of an optimal solution of MIN-MAX BCP$_k$ on instances $(G,w)$ with $G \in \mathcal{G}_k$.


Lemma 1. Let $I = (G,w)$ be an instance of MIN-MAX BCP$_k$ in which $G \in \mathcal{G}_k$,
and $v$ is a cut-vertex of $G$ such that $c(G - v) = \ell \ge k - 1$. Let $\mathcal{C} = \{C_i\}_{i\in[\ell]}$ be
the set of the components of $G - v$. Suppose further that $w(C_i) \le w(C_{i+1})$ for
every $i \in [\ell - 1]$. Then every connected $k$-partition $\mathcal{P}$ of $G$ satisfies
$w^+(\mathcal{P}) \ge w(v) + \sum_{i\in[\ell-k+1]} w(C_i)$. In particular,
$\mathrm{OPT}_k(I) \ge w(v) + \sum_{i\in[\ell-k+1]} w(C_i)$.


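Proof. Consider a connected $k$-partition $\mathcal{P}$ of $G$, and let $V^*$ be the class in $\mathcal{P}$ that
contains $v$. Let $q^* := |\{C \in \mathcal{C} : V(C) \subseteq V^*\}|$ and $q := |\{C \in \mathcal{C} : V(C) \not\subseteq V^*\}|$.
Hence, $q^* + q = \ell$ and $q \le k - 1$. Therefore, $q^* = \ell - q \ge \ell - k + 1$.
Since $w(C_1) \le w(C_2) \le \ldots \le w(C_\ell)$, we conclude that
$w(V^*) \ge w(v) + \sum_{i\in[\ell-k+1]} w(C_i)$, and thus
$w^+(\mathcal{P}) \ge w(V^*) \ge w(v) + \sum_{i\in[\ell-k+1]} w(C_i)$. Clearly, it holds that
$\mathrm{OPT}_k(I) \ge w(v) + \sum_{i\in[\ell-k+1]} w(C_i)$.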
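
The lower bound of Lemma 1 is easy to evaluate once the weights of the components of $G - v$ are known. A minimal sketch, assuming the component weights are passed as a plain list; the function name `star_lower_bound` is hypothetical.

```python
def star_lower_bound(w_v, component_weights, k):
    """Lower bound of Lemma 1: w(v) plus the l - k + 1 lightest components.

    w_v               -- weight of the cut-vertex v
    component_weights -- weights of the l components of G - v
    k                 -- number of classes, with l >= k - 1
    """
    ell = len(component_weights)
    if ell < k - 1:
        raise ValueError("Lemma 1 requires at least k - 1 components")
    lightest = sorted(component_weights)[: ell - k + 1]
    return w_v + sum(lightest)
```

With $w(v) = 2$, component weights 5, 1, 3, 4 and $k = 3$, the two lightest components (1 and 3) must share a class with $v$, giving the bound 6.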


We now present an algorithm for MIN-MAX BCP$_3$ that generalizes the algorithm
proposed by Chen et al. [9] for the unweighted version of this problem.
We adopt the same notation used by these authors to refer to the core operations
of the algorithm. The strategy used in the algorithm is to start with an
arbitrary connected 3-partition and improve it by applying successively (while
it is possible) the operations MERGE and PULL, defined in what follows.

We say that a connected 3-partition $\{V_1, V_2, V_3\}$ of $G$ is ordered if $w(V_1) \le
w(V_2) \le w(V_3)$. The input for PULL and MERGE is an ordered connected
3-partition $\{V_1, V_2, V_3\}$. As these operations may be applied several times, a
reordering of the classes is performed at the end, if necessary. In this context,
we say that an ordered 3-partition $\mathcal{P} = \{V_1, V_2, V_3\}$ is better than an ordered
3-partition $\mathcal{Q} = \{X_1, X_2, X_3\}$ if $w(V_3) < w(X_3)$. Two classes $V_i$ and $V_j$ are
adjacent if there is an edge in $G$ joining these classes. For $X \subseteq V$, we denote by
$N(X)$ the set of vertices in $G - X$ that are adjacent to a vertex of $X$.


— MERGE($\mathcal{P}$)
  — Input: an ordered connected 3-partition $\mathcal{P} = \{V_1, V_2, V_3\}$ of $G$.
  — Preconditions: (a) $w(V_3) > w(G)/2$; (b) $|V_3| \ge 2$; (c) $V_1$ and $V_2$ are
    adjacent.
  — Output: a connected 3-partition $\{V_1 \cup V_2, V_3', V_3''\}$, where $\{V_3', V_3''\}$ is an
    arbitrary connected 2-partition of $G[V_3]$. Reorder the classes if necessary,
    and return an ordered partition.


The 3-partition returned by MERGE is better than the input partition
since $w(V_3') < w(V_3)$, $w(V_3'') < w(V_3)$ and $w(V_1) + w(V_2) < w(G)/2 < w(V_3)$.

Note that a depth-first search suffices to check the preconditions. Moreover,
a connected 2-partition of $G[V_3]$ can be easily obtained from any spanning tree
of this graph. Hence, MERGE can be executed in $O(|V| + |E|)$.
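
One possible way to realize MERGE is sketched below. It is an illustration rather than the authors' implementation: the graph is assumed to be an adjacency dictionary, the preconditions are assumed to have been checked by the caller, and the connected 2-partition of $G[V_3]$ is obtained by cutting off a leaf of a breadth-first spanning tree (one of many valid choices). The names `connected_two_partition` and `merge` are hypothetical.

```python
from collections import deque


def connected_two_partition(adj, part):
    """Split a connected vertex set (size >= 2) into two connected classes.

    The last vertex discovered by a BFS is a leaf of the BFS spanning tree,
    so removing it leaves the remaining vertices connected.
    """
    part = set(part)
    root = next(iter(part))
    order, seen, queue = [], {root}, deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v in part and v not in seen:
                seen.add(v)
                queue.append(v)
    leaf = order[-1]
    return part - {leaf}, {leaf}


def merge(adj, w, v1, v2, v3):
    """MERGE: join V1 and V2, split V3 in two, and return the ordered classes."""
    a, b = connected_two_partition(adj, v3)
    classes = [set(v1) | set(v2), a, b]
    classes.sort(key=lambda c: sum(w[x] for x in c))
    return classes
```

On the unit-weight path 0-1-2-3-4 with classes {0}, {1}, {2, 3, 4}, the heaviest class drops from weight 3 to weight 2 after one call.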


— PULL($\mathcal{P}$, $U$, $i$)
  — Input: an ordered connected 3-partition $\mathcal{P} = \{V_1, V_2, V_3\}$ of $G$, a
    nonempty subset $U$ of vertices, and $i \in \{1,2\}$.
  — Preconditions: (a) $w(V_3) > w(G)/2$; (b) $U \subsetneq V_3$, $G[V_i \cup U]$ and $G[V_3 \setminus U]$
    are connected; (c) $w(V_i \cup U) < w(V_3)$.
  — Output: a connected 3-partition $\{V_j, V_i \cup U, V_3 \setminus U\}$ where $j \in \{1,2\} \setminus \{i\}$.
    Reorder the classes if necessary, and return an ordered partition.


Note that PULL($\mathcal{P}$, $U$, $i$) outputs a partition that is better than $\mathcal{P} = \{V_1, V_2, V_3\}$,
since $w(V_3 \setminus U) < w(V_3)$, $w(V_j) < w(V_3)$ and $w(V_i \cup U) < w(V_3)$. Moreover,
it is only executed when a set $U$ satisfying the preconditions is given. Thus,
this operation can be executed in $O(|V|)$ time. One may show that the time
complexity to find such a set $U \subsetneq V_3$ (if it exists) is $O(|V||E|)$. Let us denote
by PULLCHECK the algorithm that receives as input an ordered connected
3-partition $\mathcal{P} = \{V_1, V_2, V_3\}$ of $(G,w)$ and $i \in \{1,2\}$, and outputs either a set
$U \subsetneq V_3$ that satisfies the preconditions of PULL w.r.t. $i$, or the empty set $\emptyset$ (if
no such $U$ exists).
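
The text does not spell out how PULLCHECK achieves its $O(|V||E|)$ bound, so the sketch below deliberately swaps in an exponential brute-force search over subsets of $V_3$, which is only suitable for tiny instances. It assumes an adjacency dictionary with sortable vertex labels and uses 0-based class indices; the names `is_connected`, `pull_check_bruteforce` and `pull` are hypothetical.

```python
from itertools import combinations


def weight(w, vertices):
    return sum(w[v] for v in vertices)


def is_connected(adj, vertices):
    """True if the subgraph induced by a nonempty vertex set is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    stack = [next(iter(vertices))]
    seen = set(stack)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def pull_check_bruteforce(adj, w, classes, i):
    """Return a set U meeting the PULL preconditions for class index i (0 or 1),
    or the empty set if none exists. classes = [V1, V2, V3], ordered by weight."""
    v_i, v3 = set(classes[i]), set(classes[2])
    total = weight(w, set().union(*classes))
    if 2 * weight(w, v3) <= total:          # precondition (a) fails
        return set()
    for size in range(1, len(v3)):          # nonempty proper subsets of V3
        for combo in combinations(sorted(v3), size):
            u = set(combo)
            if (is_connected(adj, v_i | u)
                    and is_connected(adj, v3 - u)
                    and weight(w, v_i | u) < weight(w, v3)):
                return u
    return set()


def pull(w, classes, u, i):
    """PULL: move U from V3 to V_i and return the reordered partition."""
    j = 1 - i
    new = [set(classes[j]), set(classes[i]) | u, set(classes[2]) - u]
    new.sort(key=lambda c: weight(w, c))
    return new
```

On the unit-weight path 0-1-2-3-4-5 with classes {0}, {1}, {2, 3, 4, 5}, only the second class can pull a vertex (vertex 2), after which the heaviest class has weight 3.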
Algorithm 1. MIN-MAX-BCP3
Input: An instance $(G, w)$ of MIN-MAX BCP$_3$
Output: A connected 3-partition of $G$
Routines: MERGE, PULL and PULLCHECK.

 1: procedure MIN-MAX-BCP3($G$, $w$)
 2:   Let $\mathcal{P} = \{V_1, V_2, V_3\}$ be an ordered connected 3-partition of $G$; $W = w(G)$
 3:   while $w(V_3) > W/2$ do
 4:     if $V_1$ and $V_2$ are adjacent and $|V_3| \ge 2$ then
 5:       $\mathcal{P} \leftarrow$ MERGE($\mathcal{P}$)    # $\mathcal{P} = \{V_1, V_2, V_3\}$
 6:     else if PULLCHECK($\mathcal{P}$, $i$) returns a nonempty set $U$ for some $i \in [2]$ then
 7:       $\mathcal{P} \leftarrow$ PULL($\mathcal{P}$, $U$, $i$)    # $\mathcal{P} = \{V_1, V_2, V_3\}$
 8:     else
 9:       break
10:   return $\mathcal{P}$


Lemma 2. Algorithm 1 on input $(G,w)$, where $G = (V,E)$ and $w$ is an
integer-valued function, finds a connected 3-partition of $G$ in $O(w(G)|V||E|)$ time.


Proof (sketch). Each time a MERGE or a PULL operation is executed, the weight
of the heaviest class decreases. Thus, at most $w(G)$ calls of such operations are
performed by the algorithm. Note that both MERGE and PULL operations take
$O(|V| + |E|)$ time. The routine PULLCHECK has time complexity $O(|V||E|)$. It
follows that Algorithm 1 has time complexity $O(w(G)|V||E|)$.


It is clear that when Algorithm 1 halts and returns a partition $\mathcal{P}$, one of
the two cases occurs: (a) either the loop condition in line 3 failed, and in this
case, $\mathcal{P}$ has value $w^+(\mathcal{P}) \le w(G)/2$, or (b) neither MERGE nor PULL operations
could be performed (and $w^+(\mathcal{P}) > w(G)/2$). In what follows, we prove that in
case (b) the input graph has a particular “star-like” structure which allows us
to conclude that the solution produced by the algorithm is optimal.


Lemma 3. Let $\mathcal{P} = \{V_1, V_2, V_3\}$ be an ordered connected 3-partition produced by
Algorithm 1, and let $G_i = G[V_i]$, for $i = 1, 2, 3$. If $|V_3| \ge 2$ and $w(V_3) > w(G)/2$,
the following hold:

(i) $w(V_1) < w(G)/4$, and $V_1$ and $V_2$ are not adjacent; and

(ii) there exists $u \in V_3$ such that $u$ is a cut-vertex of $G$, $\{G_1, G_2\} \subseteq \mathcal{C}$, $w(C) \le
w(V_1) \le w(V_2)$ for each $C \in \mathcal{C} \setminus \{G_1, G_2\}$, where $\mathcal{C}$ is the set of components
of $G - u$. Moreover, if $|\mathcal{C}| = 3$ then $w(u) > w(G)/4$.


Theorem 1. Algorithm 1 is a pseudo-polynomial approximation with ratio $\frac{3}{2}$ for
MIN-MAX BCP$_3$ which runs in $O(w(G)|V||E|)$ time on an instance $(G, w)$, where
$G = (V,E)$.


Proof. Let $\mathcal{P} = \{V_1, V_2, V_3\}$ be the ordered 3-partition of $G$ returned by the
algorithm, and let $G_i = G[V_i]$, for $i = 1,2,3$. By Lemma 2, $\mathcal{P}$ is indeed a
connected 3-partition of $G$ and it can be computed in time $O(w(G)|V||E|)$.
If $w(V_3) \le w(G)/2$, then it follows directly from Fact 1 that
$w^+(\mathcal{P}) = w(V_3) \le \frac{w(G)}{2} = \frac{3}{2} \cdot \frac{w(G)}{3} \le \frac{3}{2}\mathrm{OPT}_3(G, w)$.

Suppose now that $w(V_3) > w(G)/2$. If $V_3$ is a singleton $\{u\}$, then $w(u) \le
\mathrm{OPT}_3(G,w)$ and $\mathcal{P}$ is optimal. Otherwise, the algorithm terminated because
neither the MERGE nor the PULL operation can be performed on $\mathcal{P}$. By Lemma 3(ii),
there exists $u \in V_3$ such that $u$ is a cut-vertex of $G$, $\{G_1, G_2\} \subseteq \mathcal{C}$, and $w(C) \le
w(V_1) \le w(V_2)$ for each $C \in \mathcal{C} \setminus \{G_1, G_2\}$, where $\mathcal{C}$ is the set of components
of $G - u$. By Lemma 1, we have
$w^+(\mathcal{P}) = w(V_3) = w(u) + \sum_{C \in \mathcal{C} \setminus \{G_1, G_2\}} w(C) \le \mathrm{OPT}_3(G, w)$.
Therefore, in this case the partition $\mathcal{P}$ produced by the algorithm
is an optimal solution for the instance $(G,w)$ of MIN-MAX BCP$_3$.


In what follows, we show how to extend the result obtained for MIN-MAX BCP$_3$
to obtain results for MIN-MAX BCP$_k$, for all $k \ge 4$. For simplicity, we say that a
vertex $u$ satisfying condition (ii) of Lemma 3 is a star-center. Moreover, when $u$
is a star-center, we label the $\ell$ components of $G - u$ as $\mathcal{C} = \{C_1, C_2, \ldots, C_\ell\}$,
where $C_\ell = G[V_2]$, $C_{\ell-1} = G[V_1]$ and $w(C_i) \le w(C_{i+1})$ for all $i \in [\ell - 1]$. The next
algorithm uses a routine called GETSINGLETONS, which receives as input a
connected graph $G = (V, E)$, a connected $k'$-partition $\mathcal{P}$ of $G$, and an integer $q \ge 0$
such that $k' + q \le |V|$, and produces a connected $(k' + q)$-partition of $G$ in
time $O(|V||E|)$ (where $q$ of the classes in the partition are singletons).


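Algorithm 2. MIN-MAX-BCPk  ($k \ge 3$)
Input: An instance $(G = (V, E), w)$ of MIN-MAX BCP$_k$, $3 \le k \le |V|$
Output: A connected $k$-partition of $G$
Routines: MIN-MAX-BCP3, GETSINGLETONS

 1: procedure MIN-MAX-BCPk($G$, $w$)
 2:   $\mathcal{P} \leftarrow$ MIN-MAX-BCP3($G$, $w$)    # $\mathcal{P} = \{V_1, V_2, V_3\}$
 3:   if $w^+(\mathcal{P}) \le w(G)/2$ or $|V_3| = 1$ then
 4:     $\mathcal{P}' \leftarrow$ GETSINGLETONS($G$, $w$, $k - 3$, $\mathcal{P}$)
 5:   else
 6:     Let $u$ be the star-center and let $\mathcal{C} = \{C_i\}_{i\in[\ell]}$ be the components of $G - u$.
 7:     if $\ell \ge k - 1$ then
 8:       Let $t = \ell - k + 1$ and $V' = (\bigcup_{i\in[t]} V(C_i)) \cup \{u\}$.
 9:       $\mathcal{P}' \leftarrow \{V', V(C_{t+1}), \ldots, V(C_{\ell-1}), V(C_\ell)\}$
10:     else
11:       $\mathcal{P} \leftarrow \{\{u\}\} \cup \{C_i\}_{i\in[\ell]}$
12:       $\mathcal{P}' \leftarrow$ GETSINGLETONS($G$, $w$, $k - 1 - \ell$, $\mathcal{P}$)
13:   return $\mathcal{P}'$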
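
Lines 7 to 9 of Algorithm 2 (the case $\ell \ge k - 1$) can be sketched as follows. This is an illustration only: it assumes the star-center $u$ and the vertex sets of the components of $G - u$ have already been computed, and the name `star_partition` is hypothetical.

```python
def star_partition(u, components, w, k):
    """Build the k-partition {V', V(C_{t+1}), ..., V(C_l)} around star-center u.

    components -- vertex sets of the l components of G - u (l >= k - 1)
    The t = l - k + 1 lightest components are merged with u into V'.
    """
    comps = sorted((set(c) for c in components),
                   key=lambda c: sum(w[v] for v in c))
    ell = len(comps)
    if ell < k - 1:
        raise ValueError("this case requires at least k - 1 components")
    t = ell - k + 1
    center = {u}.union(*comps[:t])
    return [center] + comps[t:]
```

The class containing $u$ has weight $w(u)$ plus the weight of the $\ell - k + 1$ lightest components, which is exactly the lower bound of Lemma 1.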


Theorem 2. For each integer $k \ge 3$, Algorithm 2 is a pseudo-polynomial
$\frac{k}{2}$-approximation for the problem MIN-MAX BCP$_k$ that runs in $O(w(G)|V||E|)$ time
on an instance $(G,w)$, where $G = (V, E)$.

Algorithm 2 is a (polynomial) $\frac{k}{2}$-approximation if the weights assigned to
the vertices are bounded by a polynomial in the order of the graph. When
the weights are arbitrary, we apply a scaling technique and use the previous
algorithm as a subroutine to obtain a polynomial algorithm for MIN-MAX BCP$_k$
with approximation ratio $(\frac{k}{2} + \varepsilon)$, for any fixed $\varepsilon > 0$.


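Algorithm 3. $\varepsilon$-MIN-MAX-BCPk  ($k \ge 3$)
Input: An instance $(G = (V, E), w)$ of MIN-MAX BCP$_k$, $3 \le k \le |V|$
Output: A connected $k$-partition of $G$
Routine: A pseudo-polynomial $\alpha$-approximation algorithm $\mathcal{A}$ for MIN-MAX BCP$_k$

1: procedure $\varepsilon$-MIN-MAX-BCPk($G$, $w$)
2:   $\theta \leftarrow \max_{v \in V} w(v)$
3:   $\lambda \leftarrow \varepsilon\theta/|V|$
4:   for $v \in V$ do
5:     $\hat{w}(v) \leftarrow \lceil w(v)/\lambda \rceil$
6:   $\mathcal{P} \leftarrow \mathcal{A}(G, \hat{w})$
7:   return $\mathcal{P}$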
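
The scaling step of Algorithm 3 can be sketched as below. Two points are assumptions rather than facts taken from the text: the scaling factor is taken to be $\lambda = \varepsilon\theta/|V|$ with $\theta$ the maximum vertex weight (a standard choice that matches the running time stated in Theorem 3), and weights are rounded up. At least one weight is assumed positive. The name `scale_weights` is hypothetical.

```python
from fractions import Fraction
from math import ceil


def scale_weights(w, eps):
    """Return the scaled integer weights ceil(w(v) / lam) and the factor lam.

    lam = eps * theta / |V|, where theta is the largest vertex weight
    (assumed choice). Exact rational arithmetic avoids rounding surprises.
    """
    theta = max(w.values())
    lam = Fraction(eps) * theta / len(w)
    scaled = {v: ceil(Fraction(w[v]) / lam) for v in w}
    return scaled, lam
```

Under this choice the total scaled weight is at most $|V|(|V|/\varepsilon + 1)$, so a pseudo-polynomial routine run on the scaled instance takes time polynomial in $|V|$ and $1/\varepsilon$.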


Theorem 3. Let $k \ge 3$ be an integer, and let $I = (G = (V,E), w)$ be an
instance of MIN-MAX BCP$_k$. If there is a pseudo-polynomial $\alpha$-approximation
algorithm $\mathcal{A}$ for MIN-MAX BCP$_k$ that runs in $O(w(G)^c|V||E|)$ time for some
constant $c$, then Algorithm 3 is an $\alpha(1 + \varepsilon)$-approximation for MIN-MAX BCP$_k$
that runs in $O(|V|^{2c+1}|E|/\varepsilon^c)$ time.


Corollary 1. For each integer $k \ge 3$ and $\varepsilon' > 0$, there is a $(\frac{k}{2} + \varepsilon')$-approximation
for MIN-MAX BCP$_k$ that runs in $O(|V|^3|E|/\varepsilon')$ time on an input
$(G = (V,E), w)$.


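Proof. The result follows from Theorem 3, by taking Algorithm 3 with $\varepsilon =
\varepsilon'/(k/2)$ and Algorithm 2 as the routine $\mathcal{A}$ it requires. The approximation
ratio $k/2$ of Algorithm 2 is guaranteed by Theorem 2, and
$\frac{k}{2}(1 + \varepsilon) = \frac{k}{2} + \varepsilon'$.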


An algorithm analogous to Algorithm 3 can be designed for MAX-MIN BCP$_k$.
In this case, change line 2 to $\theta \leftarrow \min_{v \in V} w(v)$, change line 5 to
$\hat{w}(v) \leftarrow \lfloor w(v)/\lambda \rfloor$, and consider a routine that is a pseudo-polynomial
$\alpha$-approximation for MAX-MIN BCP$_k$. Then, a theorem similar to Theorem 3 can be
obtained for MAX-MIN BCP$_k$.


3 Parameterized Algorithm for 1-Max-Min BCP


This section is devoted to the design of a fixed-parameter tractable (FPT)
algorithm for 1-MAX-MIN BCP when parameterized by the size of a vertex cover. In this
problem, we are given an unweighted graph $G$, a positive integer $k$, and a vertex
cover $X$ of $G$. The objective is to find a connected $k$-partition of $G$ that
maximizes the size of the smallest class. Let us consider a fixed instance $(G, k)$ of
1-MAX-MIN BCP and a vertex cover $X$ of $G$.
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Let us denote by I the stable set V(G) \ X. Recall that we assume k < 
|\V(G)| = |X| + |Z|. If k > |X], then there are at least k — |X| classes of size 
exactly 1 contained in J, and so an optimal solution (which has value equal 
to 1) can be easily computed. If |X| = 1, then G is a star, and so it is trivial 
to compute an optimal solution. From now on, we assume that k < |X| and 
|X| > 2. 

Before presenting the details of the proposed algorithm, we show a lemma 
that guarantees the existence of an optimal solution in which each class intersects 
the given vertex cover X. 


Lemma 4. Let $(G, k)$ be an instance of 1-MAX-MIN BCP and let $X$ be a vertex 
cover of $G$. Then, there exists an optimal connected $k$-partition $\{V_i\}_{i \in [k]}$ of $G$ 
such that $V_i \cap X \ne \emptyset$ for all $i \in [k]$. 


We remark that the proof of the above lemma follows from the fact that, if 
there exists an optimal solution for 1-MAX-MIN-BCP where a class is contained 
in V(G) \ X, then the cost of such a solution is 1. Therefore, any connected 
k-partition of G is an optimal solution. However, the same observation is not 
valid for 1-MIN-MAX-BCP as one may easily construct a (bipartite) graph G 
and a vertex cover X of G such that every optimal connected 3-partition of G 
has a class which does not intersect X. 

We next use hypergraphs to model the constraints of our ILP formulation 
for 1-MAX-MIN BCP. A hyperpath of length $m$ between two vertices $u$ and $v$ in 
a hypergraph $H$ is a set of hyperedges $\{e_1, \ldots, e_m\} \subseteq E(H)$ such that $u \in e_1$, 
$v \in e_m$, and $e_i \cap e_{i+1} \ne \emptyset$ for each $i \in \{1, \ldots, m-1\}$. A set of hyperedges 
$F \subseteq E(H)$ is a $(u, v)$-cut if there is no hyperpath between $u$ and $v$ in $H - F$. 

For each $S \subseteq X$, we define $I(S) = \{v \in I : N(v) = S\}$. Let $u, v \in X$ be a pair 
of non-adjacent vertices in $G$, and let $\Gamma_X(u,v)$ be the set of all separators of $u$ 
and $v$ in $G[X]$. Consider a separator $Z \in \Gamma_X(u,v)$, and denote by $\mathcal{C}(Z)$ the set of 
components of $G[X \setminus Z]$. Let $H_Z$ denote the hypergraph with vertices $\mathcal{C}(Z)$ such 
that, for each $S \subseteq X$ with $I(S) \ne \emptyset$, there is a hyperedge 
$\{C \in \mathcal{C}(Z) : S \cap V(C) \ne \emptyset\}$ in $H_Z$. We denote by $\Lambda_Z(u,v)$ the set of all 
$(C_u, C_v)$-cuts in $H_Z$, where $C_u$ and $C_v$ are the components of $G[X \setminus Z]$ 
containing $u$ and $v$, respectively. 

Suppose that $u$ and $v$ belong to the same class of a connected partition of $G$. 
Hence, there exists a path $P$ linking $u$ and $v$ in $G$. Note that either $P$ intersects $Z$ 
(i.e. $V(P) \cap Z \ne \emptyset$), or $P$ contains a vertex in the stable set $I$ (i.e. $V(P) \cap I \ne \emptyset$). 
In the latter case, one may easily see that $P$ guarantees the existence of a 
hyperpath $Q$ which connects the vertices $C_u$ and $C_v$ in the hypergraph $H_Z$. Therefore, 
the hyperedges of $Q$ must cross every $(C_u, C_v)$-cut in $H_Z$. The hypergraph $H_Z$ 
is illustrated in Fig. 1. 

For each $v \in X$ and $i \in [k]$, there is a binary variable $x_{v,i}$ that equals 1 if and 
only if $v$ belongs to the $i$-th class of the partition. Moreover, for every $S \subseteq X$ and 
$i \in [k]$, there is an integer variable $y_{S,i}$ that equals the number of vertices in $I(S)$ 
that are assigned to the $i$-th class. The intuition behind the $y$-variables is that all 
vertices in $I(S)$, for a fixed $S \subseteq X$, play essentially the same role in a connected 
partition. The idea of using integer variables to count indistinguishable vertices 
in a stable set appeared before in Fellows et al. [13]. 
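
The grouping of stable-set vertices into the classes I(S) can be sketched as follows. This is a minimal illustration, assuming the graph is given as an adjacency dictionary; the function name `neighborhood_classes` and the toy graph are hypothetical.

```python
from collections import defaultdict


def neighborhood_classes(adj, cover):
    """Map each S (a subset of the cover) to I(S), the vertices outside
    the cover whose neighbourhood is exactly S."""
    cover = set(cover)
    classes = defaultdict(list)
    for v, nbrs in adj.items():
        if v in cover:
            continue
        # Outside a vertex cover, every neighbour must lie in the cover.
        if not set(nbrs) <= cover:
            raise ValueError("the given set is not a vertex cover")
        classes[frozenset(nbrs)].append(v)
    return dict(classes)


# Toy graph: X = {x1, x2} is a vertex cover, I = {a, b, c, d}.
adj = {
    "x1": ["x2", "a", "b", "c"],
    "x2": ["x1", "c", "d"],
    "a": ["x1"], "b": ["x1"], "c": ["x1", "x2"], "d": ["x2"],
}
classes = neighborhood_classes(adj, ["x1", "x2"])
for S, members in sorted(classes.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(S), members)
```

Only the sizes |I(S)| matter for the ILP, so at most 2^|X| counters per class are needed regardless of how large the stable set is.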
Fig. 1. Illustration of the hypergraph construction, with panels (a) graph $G - Z$ 
and (b) hypergraph $H_Z$. Continuous lines indicate the components in $\mathcal{C}(Z)$. 
Subsets $S$ of $X$ are represented with dashed lines and their corresponding 
vertices $I(S)$ are depicted with squares. In this example, $\{S_1, S_2\}$ and $\{S_3\}$ 
are $(C_u, C_v)$-cuts in $H_Z$. 


Let $\eta = 2^{|X|}$ (number of subsets of $X$), and let $B(G,X,k)$ be the set of 
vectors in $\mathbb{R}^{(|X|+\eta)k}$ that satisfy the following inequalities (1)-(7). To shorten 
the description of inequalities (3) in the next ILP, we denote by $W$ the set 

\[ \{(uv, Z, F) : \{u,v\} \subseteq X,\ uv \notin E,\ Z \in \Gamma_X(u,v), \text{ and } F \in \Lambda_Z(u,v)\}. \]


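\[
\begin{aligned}
\sum_{v \in X} x_{v,i} + \sum_{S \subseteq X} y_{S,i} &\le \sum_{v \in X} x_{v,i+1} + \sum_{S \subseteq X} y_{S,i+1} && \forall i \in [k-1], && (1)\\
\sum_{i \in [k]} x_{v,i} &= 1 && \forall v \in X, && (2)\\
x_{u,i} + x_{v,i} - \sum_{z \in Z} x_{z,i} - \sum_{S \in F} y_{S,i} &\le 1 && \forall (uv, Z, F) \in W,\ i \in [k], && (3)\\
y_{S,i} &\le |I(S)| \sum_{v \in S} x_{v,i} && \forall S \subseteq X,\ i \in [k], && (4)\\
\sum_{i \in [k]} y_{S,i} &= |I(S)| && \forall S \subseteq X, && (5)\\
x_{v,i} &\in \{0,1\} && \forall v \in X \text{ and } i \in [k], && (6)\\
y_{S,i} &\in \mathbb{Z}_{\ge 0} && \forall S \subseteq X \text{ and } i \in [k]. && (7)
\end{aligned}
\]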


Inequalities (1) establish a non-decreasing ordering of the classes according 
to their sizes. Inequalities (2) and (5) guarantee that every vertex of the graph 
belongs to exactly one class (i.e. the classes define a partition). Due to Lemma 4, 
we may consider only partitions such that each of its classes intersects X. Thus, 
whenever a vertex in the stable set I is chosen to belong to some class, at least 
one of its neighbors in X has to be in the same class. This explains the meaning 
of inequalities (4). Inequalities (3) guarantee that each class of the partition 
induces a connected subgraph. The following lemma shows that the formulation 
correctly models the problem. 


Lemma 5. Let $G$ be a connected graph, let $k \ge 2$ be an integer, and let $X$ be a 
vertex cover of $G$. The problem 1-MAX-MIN BCP on instance $(G, k)$ is equivalent 
to 

\[ \max\Big\{ \sum_{v \in X} x_{v,1} + \sum_{S \subseteq X} y_{S,1} : (x, y) \in B(G,X,k) \Big\}. \]


An instance of an INTEGER LINEAR PROGRAMMING problem consists of a 
matrix $A \in \mathbb{Z}^{p \times q}$, a vector $b \in \mathbb{Z}^p$ and a vector $c \in \mathbb{Z}^q$. The objective is to find a 
vector $x \in \mathbb{Z}^q$ that satisfies $Ax \le b$ and maximizes $c^T x$. Let us denote by $L$ the 
size of the binary representation of an instance $(A, b, c)$ of the problem. We next 
present the maximization version of the theorem shown by Cygan et al. [11] 
on the existence of an FPT algorithm for an INTEGER LINEAR PROGRAMMING 
problem parameterized by the number of variables. 


Theorem 4 (Cygan et al. [11]). Let $I = (A, b, c)$ be an instance of an INTEGER 
LINEAR PROGRAMMING problem with size $L$ and $q$ variables. Then $I$ can 
be solved using $O(q^{2.5q+o(q)} \cdot (L + \log M_x) \log(M_x M_c))$ arithmetic operations and 
space polynomial in $L + \log M_x$, where $M_x$ is an upper bound on the absolute 
value a variable can take in a solution, and $M_c$ is the largest absolute value of a 
coefficient in the vector $c$. 


Theorem 5. The problem 1-MAX-MIN BCP, parameterized by the size of a 
vertex cover of the input graph, is fixed-parameter tractable. 


Proof. Let $(G, k)$ be an instance of 1-MAX-MIN BCP, and $X$ a vertex cover 
of $G$. From Lemma 5, we have that solving 
$\max\{\sum_{v \in X} x_{v,1} + \sum_{S \subseteq X} y_{S,1} : (x, y) \in B(G,X,k)\}$ 
is equivalent to solving instance $(G, k)$. Note that the size of the 
corresponding ILP is $2^{2^{O(|X|)}} \log |V(G)|$. By Theorem 4, this ILP can be solved 
in time $2^{2^{O(|X|)}} |V(G)|^{O(1)}$. Therefore, 1-MAX-MIN BCP is fixed-parameter 
tractable when parameterized by the size of a vertex cover of the input graph. 


4 Concluding Remarks 


Problems on balanced connected partitions of graphs have been largely 
investigated since the late seventies. Many variants of these problems, either of 
existential or optimization nature (with different objective functions), have been 
considered, most of them known to be computationally hard. 

One of these intriguing existential problems, in which the input graph is 
$k$-connected and one is interested in finding a connected $k$-partition into classes 
of prescribed sizes, was solved by Lovász [16] and Győri [15]. It has been shown 
that such a partition does exist, but it has not been settled whether it can be 
found in polynomial time. 

The variants we have considered here (MIN-MAX BCP$_k$ and MAX-MIN BCP$_k$), 
and the corresponding versions in which the number of classes $k$ is part of the input, 
have gained much attention more recently in terms of approximation algorithms. 
We note that, although for the latter variant an inapproximability threshold (of 
6/5) was proved in 2007, for the variants in which $k$ is fixed such thresholds 
are not known. The parameterized algorithm shown here seems to be the first of 
this nature for this class of problems. 
Abstract. In the online car-sharing (a.k.a. ride-sharing) problem, we 
are given a set of $m$ available cars, and $n$ requests arrive sequentially in 
$T$ periods, in which each request consists of a pick-up location and a 
drop-off location. In each period, we must immediately and irrevocably 
assign free cars to serve arrived requests, such that two requests share 
one car. The goal is to find an online algorithm to process all requests 
while minimizing the total travel distance of cars. 

We give the first algorithm for this problem under the adversarial 
model and the random arrival model. For the adversarial model, we give 
a $(2T + 1/2)$-competitive algorithm, then we show this can be further 
improved to $2T$-competitive by a carefully designed edge cost function. 
This almost matches the known $2T - 1$ lower bound in this model. For 
the random arrival model, our algorithm is $(3H_T - 1/2 + o(1))$-competitive, 
where $H_T$ is the $T$-th harmonic number. All the above three results are 
based on one single algorithm that runs in $O(n^3)$ time. 


Keywords: Car-sharing · Online matching · Competitive analysis 


1 Introduction 


In a car-sharing system, a company offers cars to customers in which each car 
serves at most two requests (see [17]). A typical scenario is the following: There 
are a number of available cars with current location information, and requests 
with pick-up and drop-off locations arrive over time. The car-sharing company 
has to serve these requests with available cars. The company wishes to minimize 
the total driving distance because it reflects the costs (serving time or fuel con- 
sumption) of serving requests. From a societal point of view, this objective also 
helps to reduce emissions and protect the environment. 


This project has received funding from the European Union’s Horizon 2020 research 
and innovation programme under the Marie Sklodowska-Curie grant agreement number 
754462. 


© Springer Nature Switzerland AG 2022 
N. Balachandran and R. Inkulu (Eds.): CALDAM 2022, LNCS 13179, pp. 224-236, 2022. 
https://doi.org/10.1007/978-3-030-95018-7_18 




Formally, the online car-sharing (OCS) problem can be described as follows. 
We are given a set $C$ of cars $\{c_k : k = 1, 2, \ldots, |C|\}$, with car $k$ initially at location $c_k$. 
Each customer request $i$ consists of a pick-up location $s_i$ and a drop-off location 
$t_i$. The request set $R$ is revealed online in $T$ different time periods. During the 
$d$-th period, a subset of requests is revealed together as a group, and the algorithm 
must immediately and irrevocably pair them and assign each request pair $\{i, j\}$ 
to an unmatched car $k$. The travel distance of car $k$ serving requests $i$ and $j$ is 
the minimum distance of a route that starts from location $c_k$ and visits the locations 
$s_i$, $s_j$, $t_i$, and $t_j$, such that $s_i$ is visited before $t_i$ and $s_j$ is visited before $t_j$. The goal of 
the online algorithm is to minimize the total travel distance of cars involved in 
the assignment. 


Related Work. The offline car-sharing problem. If $T = 1$, the online 
car-sharing problem becomes the offline car-sharing problem: all requests in $R$ are 
revealed at once. The objective is to assign all requests to the cars such that 
each car serves exactly two requests while minimizing total travel distance. Bei 
and Zhang [2] first studied this problem and proved that the offline car-sharing 
problem is NP-hard, and they also gave a 2.5-approximation algorithm. Recently, 
Luo and Spieksma [11] gave a new algorithm with an improved approximation 
ratio of 2; for the special case where the pick-up and drop-off locations coincide, 
their algorithm achieves a 7/5-approximation. Similar problems have also been 
studied in the data mining community under various contexts: the closest one is the 
work by [8], where the problem is called the "food delivery problem", with a flow time 
objective. 


The Online Minimum Metric Bipartite Matching Problem. Set $T = n/2$ 
in our problem, i.e. request pairs arrive one by one. If the pick-up and drop-off 
locations of any two requests in a pair are the same, then our problem is reduced 
to the online minimum metric bipartite matching problem. For the adversarial 
model, Kalyanasundaram and Pruhs [9] proved that no deterministic algorithm 
can achieve a competitive ratio better than $2T - 1$, which implies the same lower 
bound for our problem in the adversarial model. Furthermore, they gave an 
optimal algorithm, the Permutation algorithm, matching this lower bound. In the 
restricted version when all locations are on a line, Koutsoupias and Nanavati [10] 
showed that the work function method has a competitive ratio between $\Omega(\log T)$ 
and $O(T)$. Later, Nayyar and Raghvendra [12] gave an input-sensitive analysis 
of the algorithm in [14]. They showed that for any metric space $\mathcal{M}$ and car 
locations in $S$, the competitive ratio of the algorithm is $O(\mu_{\mathcal{M}}(S) \log^2 T)$, where 
$\mu_{\mathcal{M}}(S)$ is the maximum ratio of the traveling salesman tour and the diameter of 
a subset of cars among all subsets of $S$. In particular, if $S$ is a set of points on 
a line, then the competitive ratio is $O(\log^2 T)$. For the random arrival model, 
Raghvendra [14] came up with a $(2H_T - 1 + o(1))$-competitive algorithm, matching 
the lower bound. This lower bound also directly implies a $2H_T - 1 - o(1)$ lower bound 
for our problem in the random arrival model. 


The Dial-a-Ride Problem (DaRP). Let $\lambda \in \mathbb{N}_{>0}$ be an input parameter. 
If each car can carry up to $\lambda$ requests and can be reused an unlimited number of times, 
then the car-sharing problem becomes the classical dial-a-ride problem. The 
DaRP has been studied extensively in both the online and offline settings [1, 
3-7,13]. Most research works focused on the single-vehicle case though: the 
best offline algorithm achieves approximation ratio $\min\{O(\sqrt{n}), O(\sqrt{\lambda})\}$ [4,7], 
while the best online algorithm has competitive ratio 4 (with exponential-time 
computation) [3]. It would be interesting to extend our algorithm to the case 
when each car can serve $\lambda > 2$ requests. On the application side, DaRP has been 
studied extensively by the operations research and data mining communities. 
Some recent results are [15,16,18], which give algorithms solving 
large-scale DaRP with various constraints or objectives. Notably, the algorithm of 
[18] also has an approximation guarantee matching the best theoretical results [4,7] 
mentioned before. 


Our Results. We consider two models: the adversarial model and the random 
arrival model. 


— The (adaptive) adversarial model. In this model, at each period $d$ the 
adversary can choose any request subset from the remaining requests of $R$ to release 
as the arriving group, based on all the decisions the algorithm has made up to 
that period. We use the competitive ratio $\alpha \ge 1$ to measure the performance of 
our algorithm. Let $W(M)$ be our algorithm's assignment cost and let $W(M^*)$ 
be the minimum-possible cost (in other words, $M^*$ is an optimal offline 
assignment). If $W(M) \le \alpha W(M^*)$ for any $C$ and $R$ and any arrival order in $R$, 
then we say our algorithm is $\alpha$-competitive in the adversarial model. 

— The random arrival model. In this model, the adversary chooses 
the requests for all groups at the very beginning, but the groups arrive in 
a uniformly random order. If $E[W(M)] \le \alpha W(M^*)$ for any $C$ and $R$ (with 
expectation taken over the arrival order of $R$), then we say our algorithm is 
$\alpha$-competitive in the random arrival model. 


Motivated by the Permutation algorithm [9] and the Match-and-Assign 
algorithm [2], we give an $O(n^3)$-time algorithm named Online-Match-and-Assign 
(OMA) that works in both the adversarial model and the random arrival model. 
The algorithm essentially computes two matchings: it first pairs arrived requests 
by computing a minimum-cost matching on them, then assigns these request 
pairs to cars by computing another minimum-cost matching. By carefully choos- 
ing the cost function used when computing the two matchings, we obtain the 
following results: 


1. For the adversarial model, we first show that the OMA algorithm achieves 
2T + 1/2-competitiveness with a natural cost function defined by the met- 
ric. When T = 1, the result matches the approximation ratio 2.5 of the 
offline car-sharing problem [2]. We then tweak the cost function and show 
that OMA achieves an improved ratio of 2T. Note that this also implies a 
2-approximation algorithm for the offline car-sharing problem, matching the 
best approximation ratio so far [11]. 
2. For the random arrival model, we show the OMA algorithm is $(3H_T - 1/2 + 
o(1))$-competitive ($H_T$ is the $T$-th harmonic number), while the tightest known 
lower bound is $2H_T - 1 - o(1)$ [14]. 
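
To get a feel for how the two guarantees scale, the following sketch evaluates the adversarial ratio 2T and the leading part of the random-arrival ratio, 3H_T − 1/2, for a few values of T. The o(1) term is ignored here, which is an assumption made only for illustration; the helper name `harmonic` is hypothetical.

```python
from fractions import Fraction


def harmonic(T):
    """Exact T-th harmonic number H_T = 1 + 1/2 + ... + 1/T."""
    return sum(Fraction(1, t) for t in range(1, T + 1))


for T in (1, 2, 10, 100):
    adversarial = 2 * T                               # ratio of the improved algorithm
    random_order = 3 * harmonic(T) - Fraction(1, 2)   # o(1) term dropped
    print(T, adversarial, float(random_order))
```

Since H_T grows logarithmically in T, the random-arrival guarantee is exponentially smaller than the adversarial one for large T.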


Lastly, we point out that our algorithm can be easily adapted to allow 
exclusive requests (e.g. customers that don't want to share a ride with others): we can 
create a virtual request co-located with each exclusive request, and pair them 
together. Then the above competitive ratios still hold. 


2 Preliminaries 


2.1 Problem Setting and Notations 


We now define the Car-Sharing Problem formally. We are given a metric space 
$(\mathcal{X}, dist)$. There is a set $C$ of $m$ cars numbered from 1 to $m$, each of capacity 2, 
and the $k$-th car is identified by its initial location $c_k \in \mathcal{X}$. There is also a set $R$ 
of $n$ rider requests. The $i$-th request is denoted as a tuple $(s_i, t_i) \in \mathcal{X} \times \mathcal{X}$, where 
$s_i$ is the pick-up location and $t_i$ is the drop-off location of the rider. $R$ is further 
partitioned into $T$ groups $G_1, \ldots, G_T$, where each group $G_d$ ($d \in [T]$) contains 
requests that arrived in the same short time period and should be handled separately 
from those in other groups. We shall use $R_d = \bigcup_{i \in [d]} G_i$ to denote the set of all 
requests seen till period $d$. 

Now we describe the desired output for the Car-Sharing Problem. Without 
loss of generality, we can assume that in each group there is an even number of 
requests. We want to assign cars to requests in a way that minimizes the number 
of cars used as well as the total distance all cars travel. Specifically, a solution 
for the Car-Sharing problem can be represented as two matchings $(MR, M)$: 
$MR = \biguplus_{d \in [T]} Q_d \subseteq R \times R$ is a perfect matching over all requests $R$, and it is 
also the disjoint union of $Q_d$, $d \in [T]$, where $Q_d \subseteq G_d \times G_d$ is a perfect matching 
on $G_d$. $MR$ represents a pairing of requests: every request pair $(i, j) \in MR$ will 
share a car. The second matching $M \subseteq C \times MR$ represents an assignment from 
cars to paired requests. We can w.l.o.g. assume all pairs in $MR$ are matched, i.e., 
every request pair has its exclusive car. 

We can now define the cost of a solution. Let $(a_1, a_2, \ldots, a_K) \in \mathcal{X}^K$ be 
any sequence of locations in $\mathcal{X}$; we use $dist(a_1, a_2, \ldots, a_K) := dist(a_1, a_2) + 
dist(a_2, a_3) + \cdots + dist(a_{K-1}, a_K)$ to denote the total travel distance of moving 
from $a_1$ to $a_K$ and visiting each $a_i$ in order. Then, supposing requests $i$, $j$ are assigned 
to car $k$, the travel distance $w(k, \{i,j\})$ for car $k$ to serve request pair $\{i,j\}$ is 
defined as 

\[
\begin{aligned}
w(k, \{i,j\}) = \min\{\, & dist(c_k, s_i, t_i, s_j, t_j),\ dist(c_k, s_i, s_j, t_i, t_j), \\
& dist(c_k, s_i, s_j, t_j, t_i),\ dist(c_k, s_j, t_j, s_i, t_i), \\
& dist(c_k, s_j, s_i, t_j, t_i),\ dist(c_k, s_j, s_i, t_i, t_j) \,\}. && (1)
\end{aligned}
\]
Now the cost of a solution $(MR, M)$ can be denoted as 

\[ W(M) = \sum_{(k, \{i,j\}) \in M} w(k, \{i,j\}), \qquad (2) \]

and the goal for the Car-Sharing Problem is to find a solution with the minimum 
cost. 
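
The travel distance in (1) and the solution cost in (2) can be evaluated by direct enumeration. The sketch below is illustrative only: it assumes points on a line with dist(a, b) = |a − b| as the metric, and the names `dist`, `travel`, and `total_cost` as well as the sample instance are hypothetical.

```python
from itertools import permutations


def dist(*points):
    """Length of the walk that visits `points` in order (line metric)."""
    return sum(abs(a - b) for a, b in zip(points, points[1:]))


def travel(c_k, req_i, req_j):
    """w(k,{i,j}): shortest route from c_k that serves both requests,
    visiting each pick-up before the matching drop-off."""
    (s_i, t_i), (s_j, t_j) = req_i, req_j
    stops = [("s", 0, s_i), ("t", 0, t_i), ("s", 1, s_j), ("t", 1, t_j)]
    best = None
    for order in permutations(stops):
        pos = {(kind, r): idx for idx, (kind, r, _) in enumerate(order)}
        # Keep only the six orders with each pick-up before its drop-off.
        if pos[("s", 0)] < pos[("t", 0)] and pos[("s", 1)] < pos[("t", 1)]:
            length = dist(c_k, *(p for _, _, p in order))
            best = length if best is None else min(best, length)
    return best


def total_cost(assignment, cars, requests):
    """W(M): sum of travel distances over all (car, request pair) entries."""
    return sum(travel(cars[k], requests[i], requests[j])
               for k, (i, j) in assignment)


cars = {1: 0, 2: 10}
requests = {1: (1, 4), 2: (2, 5), 3: (9, 6), 4: (8, 7)}
assignment = [(1, (1, 2)), (2, (3, 4))]
print(total_cost(assignment, cars, requests))
```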

In the Online Car-Sharing (OCS) problem, the set $C$ of cars is known a 
priori, while the request set $R$ is revealed sequentially in $T$ periods: in the $d$-th 
period the group $G_d$ arrives, and we need to assign some cars to serve this group 
immediately and irrevocably. We will also use OCS$_{s=t}$ to denote the special 
version when every request only has one location, i.e. $t_i = s_i$ for every request $i$. 


Summary of notations. We use notation $v_{k,i} := dist(c_k, s_i)$ to denote the 
distance between car $k$ and request $i$. We define $u_{ij}$ to be the shortest path 
length that starts at $s_i$ and serves request pair $\{i,j\}$: 

\[ u_{ij} = \min\{dist(s_i, t_i, s_j, t_j),\ dist(s_i, s_j, t_i, t_j),\ dist(s_i, s_j, t_j, t_i)\}. \qquad (3) \]

Similarly, $u_{ji}$ is the shortest length of such paths starting at $s_j$. Usually $u_{ij} \ne u_{ji}$, 
but in the special case OCS$_{s=t}$ we always have $u_{ij} = u_{ji} = dist(s_i, s_j)$. 


Proposition 1. For $u, v, w$ defined as above, we have 

1a $w(k, \{i,j\}) = \min\{v_{k,i} + u_{ij},\ v_{k,j} + u_{ji}\}$ 
1b For any two requests $i$ and $j$, we have 

$\max\{u_{ij}, u_{ji}\} - \min\{u_{ij}, u_{ji}\} \le dist(s_i, s_j)$ 
1c For any assignment $(k, \{i,j\})$, we have 
$\min\{v_{k,i}, v_{k,j}\} + \min\{u_{ij}, u_{ji}\} \le w(k, \{i,j\}) \le \max\{v_{k,i}, v_{k,j}\} + \min\{u_{ij}, u_{ji}\}$ 

1d For any assignment $(k, \{i,j\})$, w.l.o.g. suppose $u_{ij} \ge u_{ji}$, we have 


w(k, {i, j}) = min {ons Ti ae Ui 3 they ; ul uid 


Table 1 gives an overview of important notations used in this paper. 


Table 1. Overview of important notations 

Notation                 Definition 
C, |C| = m               Set of cars 
c_k                      Initial location of car k ∈ C 
R, |R| = n               Set of requests 
(s_i, t_i)               Pick-up location and drop-off location for the i-th request 
T                        The total number of periods 
R_d                      The set of all requests arrived till period d ∈ [T] 
MR                       Pairing of requests 
M                        Assignment from C to MR 
w(k, {i, j})             The min travel distance of serving requests i, j using car k 
W(M)                     The sum of travel distances, W(M) = Σ_{(k,{i,j}) ∈ M} w(k, {i, j}) 
v_{k,i}                  v_{k,i} = dist(c_k, s_i) for k ∈ C and i ∈ R 
u_{ij} (resp. u_{ji})    Length of shortest path that serves requests i, j starting at s_i (resp. s_j), see (3) 
2.2 τ-Net-Cost Augmenting Path 


One building block for our algorithm is the $\tau$-net-cost augmenting path proposed by 
Raghvendra [14]. Consider a complete bipartite graph $G$ with non-negative edge 
costs $w$, and let $F$ be a matching on this graph. An alternating path (or cycle) 
is a simple path (resp. cycle) with edges alternating between those in $F$ and 
outside $F$. Suppose there are some vertices unmatched by $F$ (which we call free 
vertices). An augmenting path $P$ is an alternating path starting from one free 
vertex and ending at another free vertex. Given any such $P$, we can enlarge $F$ 
by letting $F \leftarrow F \oplus P$ (the symmetric difference of $F$ and $P$). In the minimum 
cost matching problem, we want to match more vertices while having less edge 
cost. To characterize the cost incurred by augmenting $F$ with $P$, we define the 
$\tau$-net-cost as: 

\[ \Phi_\tau(F, P, w) = \tau \sum_{e \in P \setminus F} w(e) - \sum_{e \in P \cap F} w(e), \qquad (4) \]

where $\tau \ge 1$ is a parameter. When $\tau = 1$, this is just the net cost of augmenting $F$ 
with $P$, and the well-known Hungarian algorithm augments $F$ by implicitly 
computing the path with minimum 1-net-cost. For $\tau > 1$, Raghvendra [14] showed 
that one can still find the augmenting path with minimum $\tau$-net-cost efficiently. 
We generalize the idea of [14] to finding a set of augmenting paths with small total 
$\tau$-net-cost: let $\mathcal{P}$ be a set of augmenting paths and define the $\tau$-net-cost of this 
set to be 

\[ \Phi_\tau(F, \mathcal{P}, w) = \sum_{P \in \mathcal{P}} \Phi_\tau(F, P, w). \qquad (5) \]

In Sect. 3 we give an algorithm to find an augmenting path set $\mathcal{P}$ with minimum 
$\Phi_\tau$, and use it to guide the search for a good assignment. 
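
As a small illustration of definition (4), the sketch below evaluates the τ-net-cost of one augmenting path and applies the augmentation. Edges are modelled as unordered pairs; the function names `tau_net_cost` and `augment` and the toy weights are hypothetical and not taken from the paper.

```python
def tau_net_cost(F, P, w, tau=1):
    """tau * (cost of path edges outside F) - (cost of path edges in F)."""
    matched = {frozenset(e) for e in F}
    path = [frozenset(e) for e in P]
    added = sum(w[e] for e in path if e not in matched)
    removed = sum(w[e] for e in path if e in matched)
    return tau * added - removed


def augment(F, P):
    """Symmetric difference of the matching F and the path P."""
    return {frozenset(e) for e in F} ^ {frozenset(e) for e in P}


# Toy bipartite graph: cars c1, c2 and request pairs p1, p2.
w = {
    frozenset(("c1", "p1")): 3, frozenset(("c1", "p2")): 1,
    frozenset(("c2", "p1")): 4, frozenset(("c2", "p2")): 10,
}
F = [("c1", "p1")]                               # current matching
P = [("p2", "c1"), ("c1", "p1"), ("p1", "c2")]   # augmenting path p2-c1-p1-c2

print(tau_net_cost(F, P, w, tau=1), tau_net_cost(F, P, w, tau=2))
new_F = augment(F, P)
print(sum(w[e] for e in new_F))
```

With τ = 1 the net cost is exactly the increase in matching cost; a larger τ penalizes the newly added edges more heavily.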


3 The Online-Match-and-Assign Algorithm 


In this section, we describe our main algorithm for the online car-sharing 
problem: the Online Match-and-Assign algorithm (OMA). OMA consists of two steps in 
each period: in the first step, the algorithm pairs the requests, and in the second 
step, it runs the min-tau-net-cost algorithm, which assigns the request pairs to the 
cars. Both steps essentially compute minimum-cost matchings 
w.r.t. different edge cost functions. The algorithm is summarized in Algorithm 1. 
We give a brief explanation here. 

The OMA algorithm maintains two matchings at every period $d$: an offline 
matching $M'_d$, serving as an approximation for the optimal, and the actual online 
matching $M_d$. The algorithm updates these two matchings iteratively. Recall $R_d$ 
is the set of all requests present till period $d$. In the first step, the algorithm 
computes a minimum-cost pairing $\Pi_d$ on newly arrived requests $G_d$ w.r.t. the 
following edge cost $v_1 : G_d \times G_d \to \mathbb{R}_{\ge 0}$: 

\[ v_1(\{i,j\}) = \min\{u_{ij}, u_{ji}\}, \qquad (6) \]

where $u_{ij}$ is defined in (3). The cost $v_1(\{i,j\})$ can be viewed as the shortest 
travel distance needed to serve request pair $\{i,j\}$. Let $MR_d$ be the pairing 
on $R_d$ after this step; we see that $MR_d = MR_{d-1} \cup \Pi_d$. We use 
$v_1(MR_d) := \sum_{\{i,j\} \in MR_d} v_1(\{i,j\})$ to denote the total cost of edges in pairing $MR_d$. We also 
have the following simple proposition on the pairing cost: 


Proposition 2. For any feasible pairing $MR^*_d$ until period $d$, 
$v_1(MR_d) \le v_1(MR^*_d)$. 
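
The pairing step can be illustrated on a tiny group of requests. The sketch below uses exhaustive search rather than a polynomial-time matching algorithm, so it is only suitable for very small groups; it assumes points on a line with dist(a, b) = |a − b|, and all function names and the sample group are hypothetical.

```python
def dist(*points):
    """Length of the walk that visits `points` in order (line metric)."""
    return sum(abs(a - b) for a, b in zip(points, points[1:]))


def u(req_i, req_j):
    """u_ij from (3): shortest route starting at s_i that serves both requests."""
    (s_i, t_i), (s_j, t_j) = req_i, req_j
    return min(dist(s_i, t_i, s_j, t_j),
               dist(s_i, s_j, t_i, t_j),
               dist(s_i, s_j, t_j, t_i))


def v1(req_i, req_j):
    """Pairing cost (6): the cheaper of the two starting points."""
    return min(u(req_i, req_j), u(req_j, req_i))


def min_cost_pairing(items, cost):
    """Brute-force minimum-cost perfect pairing of an even-sized list."""
    if len(items) % 2:
        raise ValueError("need an even number of requests")
    if not items:
        return 0, []
    first, rest = items[0], items[1:]
    best_cost, best_pairs = None, None
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        sub_cost, sub_pairs = min_cost_pairing(remaining, cost)
        total = cost(first, partner) + sub_cost
        if best_cost is None or total < best_cost:
            best_cost, best_pairs = total, [(first, partner)] + sub_pairs
    return best_cost, best_pairs


group = [(0, 3), (1, 4), (10, 13), (11, 14)]   # (pick-up, drop-off) on a line
pairing_cost, pairing = min_cost_pairing(group, v1)
print(pairing_cost, pairing)
```

In this example the nearby requests are paired with each other, which is what Proposition 2 relies on: no other feasible pairing of the group has a smaller total cost under v1.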


After all newly arrived requests are paired up with each other, we run the 
algorithm min-tau-net-cost$_{\tau,v_2}$ (see Algorithm 2) with parameter $\tau$ and edge cost 
function $v_2$ to assign cars to the new request pairs $\Pi_d$. Here $\tau$ is the coefficient 
in the $\tau$-net-cost $\Phi_\tau(M, \mathcal{P}, v_2)$, and $v_2 \in \mathbb{R}_{\ge 0}^{C \times (R \times R)}$ is a cost measure for a car $k$ to 
serve a request pair $\{i,j\}$. Specifically, for different arrival models we will choose 
different $v_2 \in \{\alpha, \beta, \gamma\}$ where: 


\[ \alpha(k, \{i,j\}) = \min\{v_{k,i}, v_{k,j\}} \qquad (7) \]


B(k, {, j}) = min {ons et > he ws se} (assuming ujj > Uzi) 
(8) 
\[ \gamma(k, \{i,j\}) = \max\{v_{k,i}, v_{k,j}\} \qquad (9) \]


Among the three edge cost functions above, $\alpha$ can be understood as the distance 
of the edge between a car $k$ and a request pair $\{i,j\}$, while $\beta$ can be thought of as $\alpha$ 
compensated by the travel distance needed to serve $i$ and $j$. We'll use $v_2 = \alpha$ or 
$\beta$ for the adversarial arrival model, and $v_2 = \gamma$ for the random arrival model. 


Algorithm 1. Online match and assign algorithm (OMA$_{\tau,v_2}$). 
Parameters: $\tau \ge 1$, $v_2 \in \{\alpha, \beta, \gamma\}$ 

1: $MR_0 \leftarrow \emptyset$    ▷ request pairing 
2: $M_0 \leftarrow \emptyset$    ▷ the online matching 
3: $M'_0 \leftarrow \emptyset$    ▷ the offline matching 
4: for requests that arrive in period $d$ do 
5:     $\Pi_d \leftarrow$ minimum-cost matching on $R_d \setminus R_{d-1}$ w.r.t. cost $v_1$ 
6:     $MR_d \leftarrow MR_{d-1} \cup \Pi_d$    ▷ current request pairing 
7:     $(M'_d, M_d) \leftarrow$ min-tau-net-cost$_{\tau,v_2}$($C \cup MR_d$, $M'_{d-1}$, $M_{d-1}$)    ▷ See Algorithm 2 
8: end for 
9: return $MR = MR_T$, $M' = M'_T$, $M = M_T$ 


Algorithm 2. min-tau-net-cost$_{\tau,v_2}$($C \cup MR_d$, $M'_{d-1}$, $M_{d-1}$). 
Parameters: $\tau \ge 1$, $v_2 \in \{\alpha, \beta, \gamma\}$ 

1: Compute the augmenting paths $\mathcal{P}$ with min $\Phi_\tau(M'_{d-1}, \mathcal{P}, v_2)$ 
2: Update the offline matching with $\mathcal{P}$: $M'_d \leftarrow M'_{d-1} \oplus \mathcal{P}$. 
3: Assign each request pair $\{i,j\}$ in $MR_d \setminus MR_{d-1}$ to its corresponding endpoint 
   (free car) $k$ along the augmenting path $P \in \mathcal{P}$ starting from $\{i,j\}$. Let $\Upsilon_d$ be the 
   assignment of request pairs in $MR_d \setminus MR_{d-1}$. 
4: Update the online matching: $M_d \leftarrow M_{d-1} \cup \Upsilon_d$. 
5: return $M'_d$ and $M_d$ 


The min-tau-net-cost algorithm is similar to the online minimum metric 
bipartite matching algorithm in [9,14]. The main difference here is that we need to 
compute multiple augmenting paths simultaneously, while in [9,14] they only 
need one path at a time. In period $d$, we compute the minimum-$\tau$-net-cost 
augmenting path set $\mathcal{P}$ w.r.t. the offline matching $M'_{d-1}$, and update the offline 
matching $M'_d \leftarrow M'_{d-1} \oplus \mathcal{P}$. Then we assign the new request pairs using $\mathcal{P}$: each 
$\{i,j\} \in MR_d \setminus MR_{d-1}$ is assigned to its corresponding endpoint (free car) $k$ 
along the augmenting path $P \in \mathcal{P}$ starting from $\{i,j\}$. Raghvendra [14] gave a 
polynomial-time algorithm for the $|\mathcal{P}| = 1$ case, and we generalize it to get the 
following theorem: 


Theorem 1. For every $d \in [T]$, any $\tau \ge 1$, and any edge cost $v_2$, Step 1 of 
Algorithm 2 can be computed in $O(n^3)$ time to get a set $\mathcal{P}$ of augmenting paths such 
that 

1a $|\mathcal{P}| = |MR_d \setminus MR_{d-1}|$ and all paths in $\mathcal{P}$ are (node-)disjoint with each other. 
1b Among all augmenting path sets that satisfy condition (1a), $\mathcal{P}$ is the one with 
minimum $\Phi_\tau(M'_{d-1}, \mathcal{P}, v_2)$. 


We note the main technical difficulty of the car-sharing problem: unlike in 
the metric bipartite matching problem [9,14], where the cost of assigning a car 
to a request is naturally defined by the distance between them, in our problem 
the "distance" from a car to a request pair is not well-defined. That is one reason 
why our problem is inherently harder than matching: recall that even the offline 
car-sharing problem is already NP-hard. Although we have designed some edge cost functions 
(e.g., (7), (8), (9)), they don't necessarily satisfy the triangle inequality, which 
poses additional difficulty in the analysis. 

In the next section, we analyze the algorithm performance with different $\tau$ 
and $v_2$. For the adversarial arrival model in Sect. 4.1, we use $\tau = 1$, $v_2 \in \{\alpha, \beta\}$, 
and the whole algorithm is referred to as OMA$_{1,\alpha}$ or OMA$_{1,\beta}$, respectively; while 
for the random arrival model in Sect. 4.2, we use some $\tau > 1$ and $v_2 = \gamma$, with 
the algorithm referred to as OMA$_{\tau,\gamma}$. 


4 Algorithm Analysis 


In this section, we analyze the performance of the OMA algorithm in the adver- 
sarial model and the random arrival model. 
4.1 Adversarial Order of Arrivals 


In this section, we will present a detailed analysis of the $(2T + 1/2)$-competitive 
algorithm OMA$_{1,\alpha}$. Using $\beta$ in place of $\alpha$ we can slightly improve the competitive 
ratio to $2T$. The analysis of OMA$_{1,\beta}$ is overall identical but technically more 
involved, so to keep a better flow of presentation we leave it to the supplementary material. 

In OMA$_{1,\alpha}$, we run the OMA algorithm with $v_2 = \alpha$ (see Algorithm 1): in 
each period $d$, we compute the augmenting paths $\mathcal{P}$ with min $\Phi_1(M'_{d-1}, \mathcal{P}, \alpha)$, 
then update the offline matching to $M'_d = \mathcal{P} \oplus M'_{d-1}$. First, we claim 
that $M'_d$ is a minimum perfect matching in the weighted bipartite graph 
$G = (C \cup MR_d, v_2(k, \{i,j\}))$: because an alternating path set $\mathcal{P}$ minimizing 
$\Phi_1(M'_{d-1}, \mathcal{P}, v_2)$ also minimizes $\Phi_1(M'_{d-1}, \mathcal{P}, v_2) + \sum_{(k,\{i,j\}) \in M'_{d-1}} v_2(k, \{i,j\})$, 
which is exactly $\sum_{(k,\{i,j\}) \in M'_d} v_2(k, \{i,j\})$. Therefore, we have the following 
proposition. 


Proposition 3. For any edge set $M$, let $v_2(M) := \sum_{(k,\{i,j\}) \in M} v_2(k, \{i,j\})$ 
denote the total $v_2$ for $M$. Let $M^*$ be an optimal assignment for request pairing 
$MR$ with respect to cost function $v_2 \in \{\alpha, \beta\}$. Then for any period $d \ge 1$, we 
have 

\[ v_2(M'_d) \le v_2(M') = v_2(M^*). \]


Recall now the cost $\alpha$ for car $k$ to serve request pair $\{i,j\}$ is defined to be 
$\alpha(k, \{i,j\}) = \min\{v_{k,i}, v_{k,j}\}$. Although $\alpha$ doesn't satisfy the triangle inequality, 
the following lemma shows that $\alpha$ can still be bounded by alternating path length 
plus the pairing cost. 


Lemma 1. Let $\alpha(M) := \sum_{(k,\{i,j\}) \in M} \alpha(k, \{i,j\})$ for any matching $M$. In 
period $d$, we have 

\[ \alpha(M_d \setminus M_{d-1}) \le \alpha(M'_{d-1}) + \alpha(M'_d) + \sum_{\{i,j\} \in MR_{d-1}} dist(s_i, s_j). \]


Let $M^*$ be an optimal assignment and $MR^*$ be the corresponding pairing. 
The following lemma shows that the pairing $MR$ used by our algorithm actually 
induces a good matching when using $\alpha$ as the edge cost function. 


Lemma 2. (Lemma 4 in [2]) Let $M^*$ be the minimum perfect matching in 
the weighted bipartite graph $G = (C \cup MR, \alpha(k, \{i,j\}))$ where $\alpha(k, \{i,j\}) = 
\min\{v_{k,i}, v_{k,j}\}$. We have: 

\[ \alpha(M^*) \le \sum_{(k,\{i,j\}) \in M^*} \frac{v_{k,i} + v_{k,j}}{2}. \]


Now we can prove the competitive-ratio for OMA, a. 


Theorem 2. For OCS in the adversarial model, OMA,,. algorithm is (2T + 
1/2)-competitive. 
Proof. According to definition (2), we have: 

$$\begin{aligned}
W(M) &= \sum_{(k,\{i,j\}) \in M} w(k,\{i,j\}) \le \sum_{(k,\{i,j\}) \in M} \left(\max\{v_{k,i}, v_{k,j}\} + \min\{u_{ij}, u_{ji}\}\right) \\
&\le \sum_{(k,\{i,j\}) \in M} \left(\min\{v_{k,i}, v_{k,j}\} + dist(s_i, s_j) + \min\{u_{ij}, u_{ji}\}\right) \\
&\le \sum_{d \in [T]} \; \sum_{(k,\{i,j\}) \in M_d \setminus M_{d-1}} \left(\min\{v_{k,i}, v_{k,j}\} + dist(s_i, s_j)\right) + \sum_{\{i,j\} \in MR} \min\{u_{ij}, u_{ji}\}.
\end{aligned}$$

The first inequality is by Proposition 1c and the second one is by the triangle 
inequality. Then, by Lemma 1 and the definition of $\alpha$, we have 

$$\sum_{(k,\{i,j\}) \in M_d \setminus M_{d-1}} \min\{v_{k,i}, v_{k,j}\} \le \alpha(M'_{d-1}) + \alpha(M'_d) + \sum_{\{i,j\} \in MR_{d-1}} dist(s_i, s_j).$$

Recall that $v_1(\{i,j\}) := \min\{u_{ij}, u_{ji}\}$ and $v_1(MR) := \sum_{\{i,j\} \in MR} v_1(\{i,j\})$, thus 

$$W(M) \le \sum_{d \in [T]} \Big( \alpha(M'_{d-1}) + \alpha(M'_d) + \sum_{\{i,j\} \in MR_d} dist(s_i, s_j) \Big) + v_1(MR).$$

By Proposition 3 we have $\alpha(M'_d) \le \alpha(M'_T) = \alpha(M^*)$. Furthermore, by definition, 
for all $1 \le d \le T$ we have $\sum_{\{i,j\} \in MR} dist(s_i, s_j) \ge \sum_{\{i,j\} \in MR_d} dist(s_i, s_j)$. 
Combining the two, we get 

$$\begin{aligned}
W(M) &\le (2T-1)\,\alpha(M^*) + T \sum_{\{i,j\} \in MR} dist(s_i, s_j) + v_1(MR) \\
&\le (2T-1) \sum_{(k,\{i,j\}) \in M^*} \frac{v_{k,i} + v_{k,j}}{2} + T \sum_{\{i,j\} \in MR} dist(s_i, s_j) + v_1(MR) \\
&\le (2T-1) \sum_{(k,\{i,j\}) \in M^*} \left(\min\{v_{k,i}, v_{k,j}\} + dist(s_i, s_j)/2\right) + T \sum_{\{i,j\} \in MR} dist(s_i, s_j) + v_1(MR) \\
&\le (2T-1) \sum_{(k,\{i,j\}) \in M^*} \min\{v_{k,i}, v_{k,j}\} + \left((2T-1)/2 + T + 1\right) v_1(MR^*) \\
&\le (2T + 1/2)\, W(M^*).
\end{aligned}$$

The second inequality follows from Lemma 2. The third inequality follows from 
the triangle inequality. The fourth inequality holds because $\min\{u_{ij}, u_{ji}\} \ge 
dist(s_i, s_j)$ and $v_1(MR^*) \ge v_1(MR) \ge \sum_{\{i,j\} \in MR} dist(s_i, s_j)$ (by Proposition 2 
and the triangle inequality). The last inequality follows from Proposition 1c. 
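
As a bookkeeping check on the constants (our own arithmetic, added for the reader and not part of the original proof), the coefficient of $v_1(MR^*)$ in the fourth line of the last chain simplifies as follows:

```latex
\frac{2T-1}{2} + T + 1 \;=\; 2T + \frac{1}{2},
\qquad\text{and}\qquad
2T - 1 \;\le\; 2T + \frac{1}{2}.
```

Hence both terms of the fourth line carry a factor of at most $2T + 1/2$, which is the ratio claimed in Theorem 2.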
Replacing the edge cost function $\alpha$ with $\beta$ (see (8)), we can remove the 
additive 1/2 to get a $2T$-competitive algorithm, matching the best known offline 
($T = 1$) approximation ratio. The analysis of OMA$_\beta$ is quite similar to that of OMA$_\alpha$. 


Theorem 3. For OCS in the adversarial model, the OMA$_\beta$ algorithm is $2T$-competitive. 


For the special case where each request's pick-up location $s_i$ coincides with 
its drop-off location $t_i$, we can show a slightly tighter ratio. 


Theorem 4. For OCS$_{s=t}$ in the adversarial model, the OMA$_\alpha$ algorithm is $(2T - 
1/2)$-competitive. 


Kalyanasundaram and Pruhs [9] proved that, for the minimum metric bipartite 
matching problem, no deterministic algorithm can achieve a competitive ratio 
smaller than $2T - 1$. This directly implies a $2T - 1$ lower bound for OCS and 
OCS$_{s=t}$, because the online minimum metric bipartite matching problem is a special 
case of OCS$_{s=t}$. Also note that when $T = 1$, OCS becomes the offline car-sharing 
problem, and we get a competitive ratio of 2, which is also the best-known 
approximation ratio so far [11]. 


4.2 Random Order of Arrivals 


In this section, we analyze the performance of OMA$_\gamma$ for OCS in the random 
arrival model. As in OMA$_\alpha$, we augment $M'_{d-1}$ by finding a set of augmenting 
paths $P$, and let $M'_d = P \oplus M'_{d-1}$. The main differences from the adversarial 
model are: (1) we use a different edge cost function $v_2(k,\{i,j\}) = \gamma(k,\{i,j\}) = 
\max\{v_{k,i}, v_{k,j}\}$, and (2) $M'_d$ ($d \ge 2$) is not necessarily the minimum weight 
perfect matching in the weighted graph $G = (C \cup MR_d, \gamma(k,\{i,j\}))$. We first 
bound the increased cost by the augmenting path length. 


Lemma 3. Let $\gamma(M) := \sum_{(k,\{i,j\}) \in M} \gamma(k,\{i,j\})$ for any assignment $M$. In 
period $d$, let $P$ be the alternating paths with respect to $M'_{d-1}$; we have 

$$\gamma(M_d \setminus M_{d-1}) \le \gamma(P \setminus M'_{d-1}) + \gamma(P \cap M'_{d-1}).$$

We also have the following lemma from [14] that relates the matching cost 
with the total $\tau$-net-cost of augmenting paths produced over time: 


Lemma 4. (Lemma 7. (ii) in [14]) Let $\tau > 1$. Let $P_1, P_2, \dots, P_T$ be the 
augmenting path sets computed by our algorithm in that order. Then, the $\tau$-net-cost 
of these paths relates to the cost of the online matching as follows: 

$$\sum_{d=1}^{T} \phi_\tau(P_d) \ge \frac{\tau+1}{2}\,\gamma(M'_T) + \frac{\tau-1}{2}\,\gamma(M).$$


Now we prove the competitive ratio for OMA$_\gamma$. 

Theorem 5. In the random arrival model, the OMA$_\gamma$ algorithm is $(3H_T - 1/2 + 
o(1))$-competitive. 


There also exists a $2H_T - 1 - o(1)$ lower bound for the minimum metric 
bipartite matching problem in the random arrival model [14]. As in the 
adversarial model, this implies the same $2H_T - 1 - o(1)$ lower bound for OCS 
and OCS$_{s=t}$ in the random arrival model. 


5 Conclusion 


We gave the first algorithm for the online car-sharing problem in the adaptive 
adversarial model and the random arrival model. Our algorithm achieves a near-optimal 
competitive ratio in both models. One immediate open problem is to 
allow each car to serve $\lambda > 2$ requests. It is natural to follow the same 
approach as in this paper: first "cluster" the requests according to some criteria, 
then assign cars to the resulting clusters by solving a certain min-cost matching. 
However, the competitive ratio will likely depend on $\lambda$: one feature that 
makes our problem hard is the rigid requirement that every car serves exactly 
$\lambda$ requests, and this often implies solving hard problems like $\lambda$-dimensional 
matching. Another direction worth exploring is to consider different objectives, e.g., 
customer waiting time (a.k.a. flow time), which has apparent practical importance. 
Abstract. Given $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$, the Subset Sum problem 
(SSUM) is to decide whether there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. 
Bellman (1957) gave a pseudopolynomial time dynamic programming 
algorithm which solves Subset Sum in $O(nt)$ time and $O(t)$ space. 

In this work, we present search algorithms for variants of the Subset 
Sum problem. Our algorithms are parameterized by $k$, which is a 
given upper bound on the number of realisable sets (i.e. the number of solutions, 
summing to exactly $t$). We show that SSUM with a unique solution 
is already NP-hard, under randomized reduction. This makes the regime 
of parametrized algorithms, in terms of $k$, very interesting. 

Subsequently, we present an $\tilde{O}(k \cdot (n+t))$ time deterministic algorithm, 
which finds the hamming weights of all the realisable sets for a subset sum 
instance. We also give a poly$(knt)$-time and $O(\log(knt))$-space deterministic 
algorithm that finds all the realisable sets for a subset sum instance. 
Our algorithms use analytic and number-theoretic techniques. 


Keywords: Subset sum · Power series · Isolation lemma · Hamming 
weight · Interpolation · Logspace · Newton's identities 


1 Introduction: Variants of Subset Sum 


The Subset Sum problem (SSUM) is a well-known NP-complete problem [1, 
p. 226], where given $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$, the problem is to decide whether 
there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. In recent years, provably secure 
cryptosystems based on SSUM, such as private-key encryption schemes [2] and 
tag-based encryption schemes [3], have been proposed. There have been numerous 
improvements in the algorithms that solve the SSUM problem in both 
the classical [4-8] and quantum [9-11] settings. One of the first algorithms was 
due to Bellman [12], who gave an $O(nt)$ time (pseudo-polynomial time) algorithm 
which requires $\Omega(t)$ space. One can ask for a search version of this problem, 
i.e. to output all the solutions. Since there can be exponentially many solutions, 
it could take $\exp(n)$ time (and space) to output them. This motivates our first 
problem defined below. 
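
For orientation, the pseudo-polynomial dynamic program attributed to Bellman above can be sketched as follows. This is a minimal textbook-style illustration, not code from the paper; the function name `subset_sum_dp` is our own, and the sketch only answers the decision version using $O(nt)$ time and $O(t)$ space.

```python
def subset_sum_dp(a, t):
    """Decide whether some subset of the non-negative integers in `a` sums to t.

    reachable[s] is True iff some subset of the items seen so far sums to s.
    """
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset realises the sum 0
    for x in a:
        # Go downwards so that each item is used at most once.
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]
```

The table has $t + 1$ entries, which is exactly the $\Omega(t)$ space cost that the low-space results discussed later aim to avoid.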


Problem 1 ($k$-SSSUM). Given $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$, the $k$-solution SSUM 
($k$-SSSUM) problem asks to output all $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$, provided with 
the guarantee that the number of such subsets is at most $k$. 


> Remark. We denote 1-SSSUM as the unique Subset Sum problem (uSSSUM). 
On StackExchange, a more restricted version was asked where it was assumed that 
$k = 1$ for any realizable $t$. Here we just want $k = 1$ for some fixed target value 
$t$ and we do not assume anything for any other value $t'$. 

Now, we consider a different restricted version of $k$-SSSUM, where 
we demand to output only the hamming weights of the $k$ solutions (we call it 
Hamming-$k$-SSSUM; for the definition see Problem 2). By the hamming weight of 
a solution, we mean the number of $a_i$'s in the solution set (which sum up to 
exactly $t$). In other words, if $\vec{a} \cdot \vec{v} = t$, where $\vec{a} = (a_1,\dots,a_n)$ and $\vec{v} \in \{0,1\}^n$, 
we want $|\vec{v}|_1$, the $\ell_1$-norm of the solution vector. 


Problem 2 (Hamming-$k$-SSSUM). Given an instance of $k$-SSSUM, 
say $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$, with the promise that there are at most $k$-many 
$S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$, Hamming-$k$-SSSUM asks to output all the 
hamming weights (i.e., $|S|$) of the solutions. 


It is obvious that solving $k$-SSSUM solves Problem 2. Importantly, the decision 
problem, namely HWSSUM, is already NP-hard. The HWSSUM problem 
is: given an instance $(a_1,\dots,a_n,t,w) \in \mathbb{Z}_{\ge 0}^{n+2}$, decide whether there is a 
solution to the Subset Sum instance with hamming weight equal to $w$. Note that there is a 
trivial Cook reduction from SSUM to HWSSUM: SSUM decides 'yes' on 
the instance $(a_1,\dots,a_n,t)$ iff at least one of the HWSSUM instances 
$(a_1,\dots,a_n,t,i)$, for $i \in [n]$, decides 'yes'. Therefore, the search version of HWSSUM, 
the Hamming-$k$-SSSUM problem, is already an interesting problem and 
worth investigating. 

In this work, we give various deterministic algorithms for Problems 1 and 2. Our 
algorithms are algebraic and number-theoretic in nature and mainly build upon 
the previous power series techniques of Jin and Wu [6] and sparse interpolation [13]. 


1.1 Main Results 


In this section, we briefly state our main results. The leitmotif of this paper is 
to give efficient algorithms for variants of SSUM, with a promise of a bounded 
number of solutions. Our first theorem gives an efficient pseudo-linear $\tilde{O}(n + t)$ 
time deterministic algorithm for Problem 2, for constant $k$. 


Theorem 1 (Algorithm for hamming weight). There is an $\tilde{O}(k(n+t))$-time 
deterministic algorithm for Hamming-$k$-SSSUM. 
> Remark (Optimality). We emphasize that Theorem 1 is likely to be 
near-optimal for bounded $k$, due to the following argument. An $O(t^{1-\epsilon})$ time 
algorithm for Hamming-1-SSSUM can be directly used to solve 1-SSSUM, as 
discussed above. By using the randomized reduction (Theorem 3), this would give 
us a randomized $n^{O(1)} t^{1-\epsilon}$-time algorithm for SSUM. But in [14] the authors 
showed that SSUM does not have an $n^{O(1)} t^{1-\epsilon}$ time algorithm unless the Strong 
Exponential Time Hypothesis (SETH) is false. 


Theorem 2 (Algorithms for finding solutions in low space). There is a 
poly$(knt)$-time and $O(\log(knt))$-space deterministic algorithm which solves 
$k$-SSSUM. 


> Remark. When considering low space algorithms outputting multiple values, 
the standard assumption is that the output is written onto a one-way tape which 
does not count towards the space complexity; so an algorithm outputting $kn \log n$ 
bits (as in the above case) could use much less working memory than $kn \log n$; 
for a reference see McKay and Williams [15]. 


> Comparison with the Trivial Algorithm. Consider the usual search-to-decision 
reduction for subset sum: first try to include $a_1$ in the subset, and if this is 
feasible then we subtract $a_1$ from $t$ and add $a_1$ to the solution, and then 
continue with $a_2$, and so on. This procedure finds a single solution, but if 
we implement it in a recursive way then it can find all the $k$ solutions in 
$k \cdot n \cdot$ (time complexity for decision version) time; we can think of an $n$-level 
binary recursion tree where all the infeasible subtrees are pruned. 
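
The recursive search-to-decision procedure described above can be sketched as follows. This is an illustrative sketch under our own assumptions: the name `all_solutions` is hypothetical, and a precomputed table of reachable suffix sums stands in for the decision oracle, which in the paper's comparison would be a separate decision algorithm.

```python
def all_solutions(a, t):
    """Enumerate all index sets S (1-based) with sum_{i in S} a[i] == t,
    pruning every subtree of the binary recursion tree that is infeasible."""
    n = len(a)
    # reach[i] = set of sums <= t realisable using only a[i:], acting as
    # a stand-in for a decision oracle on the remaining items.
    reach = [set() for _ in range(n + 1)]
    reach[n] = {0}
    for i in range(n - 1, -1, -1):
        reach[i] = reach[i + 1] | {s + a[i] for s in reach[i + 1] if s + a[i] <= t}

    out = []

    def rec(i, remaining, chosen):
        if remaining not in reach[i]:
            return  # infeasible subtree: prune
        if i == n:
            out.append(list(chosen))
            return
        if a[i] <= remaining:          # branch 1: include a[i]
            chosen.append(i + 1)
            rec(i + 1, remaining - a[i], chosen)
            chosen.pop()
        rec(i + 1, remaining, chosen)  # branch 2: exclude a[i]

    rec(0, t, [])
    return out
```

Because every surviving node lies on a path to some solution, the recursion visits at most about $k \cdot n$ nodes, one oracle query per node, matching the count in the text.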


Theorem 1 Is Better than the Trivial. Since the number of solutions is bounded by $k$, 
choosing a prime $p > n+t+k$ suffices in [6] to make the algorithm deterministic. 
Thus, the time complexity of the decision version is $\tilde{O}((n+t) \log k)$. Hence, from 
the above, the search complexity is $\tilde{O}(kn(n+t))$, which is worse than Theorem 1. 


Theorem 2 Is Better than the Trivial. For solving the decision problem in low 
space, we simply use Kane's $O(\log(nt))$-space poly$(nt)$-time algorithm [16]. As 
explained (and improved) in [7], the time complexity is actually $O(n^3 t)$ and the 
extra space usage is $O(n)$ for remembering the recursion stack. Thus the total 
time complexity is $O(kn^4 t)$ and it takes $O(n) + O(\log t)$ space, while Theorem 
2 takes $O(\log(knt))$ space and poly$(knt)$ time. Although our time complexity is 
worse!, when k < 20(("l08#)'~) | for € > 0, our space complexity is better. 


1.2 Technical Overview 


All the algorithms presented in this paper consider that the number of solutions 
is bounded by a parameter k. This naturally raises the question whether the 
SSUM problem is hard, even when the number of solutions is bounded. We will 
show that this is true even for the case when $k = 1$, i.e., uSSSUM is NP-hard 
under randomized reduction. 


1 Theorem 2 is not about time complexity; as long as it is pseudopolynomial time, it is fine. 
Theorem 3 (Hardness of uSSSUM). There exists a randomized reduction 
which takes an SSUM instance $M = (a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$ as input, and 
produces multiple SSUM instances $SS_\ell = (b_1,\dots,b_n,t^{(\ell)})$, where $\ell \in [2n^2]$, such 
that 

- if $M$ is a YES instance of SSUM $\implies$ $\exists \ell$ such that $SS_\ell$ is a YES instance of 
uSSSUM; 
- if $M$ is a NO instance of SSUM $\implies$ $\forall \ell$, $SS_\ell$ is a NO instance of uSSSUM. 


Proof. The core of the proof is based on Lemma 1 (Isolation Lemma). The 
reduction is as follows. Let $w_1,\dots,w_n$ be chosen uniformly at random from 
$[2n]$. We define $b_i = 4n^2 a_i + w_i$, $\forall i \in [n]$, and the $\ell$-th SSUM instance as $SS_\ell = 
(b_1,\dots,b_n, t^{(\ell)} = 4n^2 t + \ell)$. Observe that the new instances differ only 
in the target values $t^{(\ell)}$. 

Suppose $M$ is a YES instance, i.e., $\exists S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. Then, 
for $\ell = \sum_{i \in S} w_i$, $SS_\ell$ is a YES instance, because 

$$\sum_{i \in S} b_i - t^{(\ell)} = 4n^2 \Big( \sum_{i \in S} a_i - t \Big) - \Big( \ell - \sum_{i \in S} w_i \Big) = 0.$$

If $M$ is a NO instance, consider any $\ell$ and $S \subseteq [n]$. Since $M$ is a NO instance, 
$4n^2 (\sum_{i \in S} a_i - t)$ is a non-zero multiple of $4n^2$, whereas $|\ell - \sum_{i \in S} w_i| < 4n^2$, 
which implies that 

$$4n^2 \Big( \sum_{i \in S} a_i - t \Big) - \Big( \ell - \sum_{i \in S} w_i \Big) \ne 0 \implies \sum_{i \in S} b_i \ne t^{(\ell)}.$$

Hence, $SS_\ell$ is also a NO instance. 

We now show that if $M$ is a YES instance, then one of the $SS_\ell$ is a uSSSUM instance. 
Let $F$ contain all the solutions to the SSUM instance $M$, i.e. $F = \{S \mid S \subseteq 
[n], \sum_{i \in S} a_i = t\}$. Since the $w_i$'s are chosen uniformly at random, Lemma 1 says 
that, with probability at least 1/2, there exists a unique $S \in F$ such that 
$w(S) = \sum_{i \in S} w_i$ is minimal. Let us denote this minimal value $w(S)$ as $\ell^*$. Then, 
$SS_{\ell^*}$ is a uSSSUM instance because $S$ is the only subset in $F$ such that $\sum_{i \in S} w_i = \ell^*$. 
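
The reduction in the proof is easy to experiment with on tiny instances. The following sketch builds the instances $SS_\ell$ and counts their solutions by brute force; the function names `reduce_to_unique` and `count_solutions` are our own, and the exponential-time counter is only there for checking small examples.

```python
import random
from itertools import combinations

def reduce_to_unique(a, t, rng):
    """Build b_i = 4 n^2 a_i + w_i and targets 4 n^2 t + l for l in [2 n^2]."""
    n = len(a)
    w = [rng.randint(1, 2 * n) for _ in range(n)]  # weights drawn from [2n]
    b = [4 * n * n * ai + wi for ai, wi in zip(a, w)]
    targets = {l: 4 * n * n * t + l for l in range(1, 2 * n * n + 1)}
    return b, targets, w

def count_solutions(b, target):
    """Brute-force count of subsets of b summing to target (small n only)."""
    n = len(b)
    return sum(
        1
        for r in range(n + 1)
        for S in combinations(range(n), r)
        if sum(b[i] for i in S) == target
    )
```

Each solution $S$ of the original instance survives in exactly one new instance, namely the one with $\ell = w(S)$, so an instance $SS_\ell$ has a unique solution precisely when exactly one original solution has weight $\ell$.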


Proof idea of Theorem 1. First we sketch the idea for $k = 1$. Suppose we 
have a uSSSUM instance such that the hamming weight of the unique solution 
is $w$. Choose a prime $q = O(n+k+t)$ and a primitive root $\mu$, i.e. $ord_q(\mu) = q-1$ 
(for the definition, see Definition 2). We can find them efficiently in $\tilde{O}(n + k + t)$ 
time. 

Now, consider the following important polynomial $f(x) = \prod_{i=1}^{n} (1 + \mu \cdot x^{a_i})$. 
Observe that the coefficient of $x^t$ in $f$ is $\mu^w$. Therefore, by using Lemma 6, we 
can find $\mu^w$ from $f(x)$ and extract $w$, since $ord_q(\mu) = q-1 > n \ge w$. This solves 
Hamming-1-SSSUM. 
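
The $k = 1$ idea can be tried out with a naive implementation. The sketch below is ours, not the paper's algorithm: it replaces the fast coefficient extraction of Lemma 6 with a plain $O(nt)$ dynamic program, and finds the prime and the primitive root by brute force. All function names are hypothetical.

```python
def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def primitive_root(q):
    """Brute force: return g whose multiplicative order mod prime q is q - 1."""
    for g in range(2, q):
        x, order = g, 1
        while x != 1:
            x = x * g % q
            order += 1
        if order == q - 1:
            return g
    raise ValueError("no primitive root found")

def coeff_xt(a, t, c, q):
    """Coefficient of x^t in prod_i (1 + c * x^{a_i}) over F_q (naive DP)."""
    poly = [1] + [0] * t
    for ai in a:
        for s in range(t, ai - 1, -1):
            poly[s] = (poly[s] + c * poly[s - ai]) % q
    return poly[t]

def unique_hamming_weight(a, t):
    """Hamming weight of the unique solution, or None if there is no solution.
    Assumes the instance has at most one solution."""
    n = len(a)
    q = n + t + 2
    while not is_prime(q):
        q += 1
    mu = primitive_root(q)
    target = coeff_xt(a, t, mu, q)  # equals mu^w mod q for a unique solution
    for w in range(n + 1):
        if pow(mu, w, q) == target:
            return w
    return None
```

Since $\mu$ has order $q - 1 > n$, the powers $\mu^0, \dots, \mu^n$ are pairwise distinct, so the loop at the end recovers $w$ unambiguously.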

This idea can be extended to a general $k$-SSSUM instance. Observe that we 
cannot directly use the above trick for a single polynomial $f(x)$, since, in this 
case, the coefficient of $x^t$ is $\sum_{i \in [\ell]} \lambda_i \cdot \mu^{w_i}$, where the $w_i$ are the hamming weights of 
the solutions, which occur $\lambda_i$ times. Eventually, we want to create a polynomial 
whose roots are of the form $\mu^{w_i}$, so that we can first find the roots $\mu^{w_i}$ (over 
$\mathbb{F}_q$), and from them we can find $w_i$. To achieve that, we work with $k$-many 
polynomials $f_j := \prod_{i=1}^{n} (1 + \mu^j \cdot x^{a_i})$, for $j \in [k]$. Note that the coefficient of $x^t$ 
in $f_j$ is of the form $\sum_{i \in [\ell]} \lambda_i \cdot \mu^{j w_i}$ (Claim 2). By Newton's Identities (Lemma 3) 
and Vieta's formulas (Lemma 4), we can now efficiently construct a polynomial 
whose roots are $\mu^{w_i}$. For details, see Sect. 3. 


Proof idea of Theorem 2. The above polynomial method fails to give a low 
space algorithm, since Lemma 6 requires $\Omega(t)$ space (eventually it needs to store 
all the coefficients mod $x^{t+1}$). Therefore, our proof idea for Theorem 2 is 
completely different from that of Theorem 1. Here, we work with a multivariate 
polynomial $f(x, y_1,\dots,y_n) = \prod_{i=1}^{n} (1 + y_i x^{a_i})$ over $\mathbb{F}_q$, for a large prime $q = O(nt)$, 
and its multiple evaluations $f(\alpha, c_1,\dots,c_n)$, where $(\alpha, c_1,\dots,c_n) \in \mathbb{F}_q^{n+1}$. 

Observe that the coefficient of $x^t$ in $f$ is a multivariate polynomial 
$p_t(y_1,\dots,y_n)$; each of its monomials carries the necessary information of a 
solution for the instance $(a_1,\dots,a_n,t)$. More precisely, $S$ is a realisable set of 
$(a_1,\dots,a_n,t)$ $\iff$ $\prod_{i \in S} y_i$ is a monomial in $p_t$. Moreover, the sparsity (number 
of monomials) of $p_t$ is at most $k$. 

Therefore, it boils down to finding the multivariate polynomial $p_t$. How easy 
is it to find $p_t$? Note that we cannot expect to find $p_t$ just by trivial 
multiplication, as it would take $O(2^n t)$ time! Instead, our algorithm is a reconstruction 
algorithm, which efficiently reconstructs $p_t$ from multiple evaluation points 
$f(\alpha, c_1,\dots,c_n)$, for $\alpha \in \mathbb{F}_q^*$. Eventually, we will use sparse interpolation [13] 
(see Theorem 6), which requires evaluations of the polynomial $p_t(y_1,\dots,y_n)$ at 
multiple (polynomially many) points $(c_1,\dots,c_n) \in \mathbb{F}_q^n$. To find $p_t(c_1,\dots,c_n)$, we 
use Kane's identity (Lemma 2), which uses the evaluations $f(\alpha, c_1,\dots,c_n)$, for 
$\alpha \in [1, q-1]$. Finding $p_t(c_1,\dots,c_n)$ can be efficiently done in logspace. The rest 
(reconstructing $p_t$) requires a brief space complexity analysis of [13]. For details, 
refer to Sect. 4. 


1.3 Prior Works and Their Limitations 


Before going into the details, we briefly review the state of the art for the problem 
(and its variants). After Bellman's $O(nt)$ dynamic programming solution [12], Pisinger [17] 
first improved it to $O(nt/\log t)$ on word-RAM models. Recently, Koiliaris and 
Xu gave a deterministic algorithm [18,19] running in time $\tilde{O}(\sqrt{n}\,t)$, which is the best 
deterministic algorithm so far. Bringmann [5] and Jin and Wu [6] later improved 
the running time to randomized $\tilde{O}(n + t)$. All these algorithms require $\Omega(t)$ 
space. Moreover, most of the recent algorithms solve the decision version. Here 
we remark that Abboud et al. [14] recently showed that SSUM has no $t^{1-\epsilon} n^{O(1)}$ 
time algorithm for any $\epsilon > 0$, unless the Strong Exponential Time Hypothesis 
(SETH) is false. Therefore, the $\tilde{O}(n + t)$ time bound is likely to be near-optimal. 

In [18] (also see [19, Lemma 2]), the authors gave a deterministic $O(nt)$ algorithm 
that finds all the hamming weights for all realisable targets less than or equal 
to $t$. Their algorithm does not depend on the number of solutions for a particular 
target. Compared to this, our Theorem 1 is faster when $k = o(n/(\log n)^c)$, for 
a large constant $c$. Similarly, with the 'extra' information of $k$, we give a faster 
deterministic algorithm (which even outputs all the hamming weights of the 
solutions) compared to the $\tilde{O}(\sqrt{n}\,t)$ decision algorithm in [18,19] (which outputs all 
the realisable subset sums $\le t$), when $k = o(\sqrt{n}/(\log n)^c)$, for a large constant $c$. 
Here we remark that the $O(nt)$-time dynamic programming algorithm [12] can 
be easily modified to find all the solutions, but this gives an $O(n(k + t))$-time 
(and space) algorithm. 

On the other hand, there has been quite some work on solving SSUM in 
LOGSPACE. Lokshtanov and Nederlof [20], and Kane [16] (2010) gave $O(\log nt)$-space 
poly$(nt)$-time deterministic algorithms, which have very recently been 
improved to $\tilde{O}(n^2 t)$ time and poly $\log(nt)$ space. On the other hand, Bringmann 
[5] gave an $nt^{1+\epsilon}$ time, $O(n \log t)$ space randomized algorithm, which has been 
improved to $O(\log n \log\log n + \log t)$ space in [7]. Again, most of these algorithms 
are decision algorithms and do not output the solution set. In contrast to this, 
our algorithm in Theorem 2 uses only $O(\log(knt))$ space and outputs all the 
solution sets, which is near-optimal. 

Finally, we remark that in the proof of Theorem 1, we extend analytic tools 
from [6] to our advantage (see Lemma 6), yet our algorithm for Theorem 1 is 
deterministic (unlike in [6]). 


2 Preliminaries and Notations 


Notations. $\mathbb{Z}$ and $\mathbb{Q}$ denote the sets of all integers and rationals, respectively. 
For any integer $n > 0$, $[n]$ denotes the set $\{1,2,\dots,n\}$, while $2^{[n]}$ denotes the set 
of all subsets of $[n]$. $\log$ denotes $\log_2$. We also denote $\tilde{O}(g)$ to be $g \cdot$ poly$(\log g)$. 

The sparsity of a polynomial $f(x_1,\dots,x_n) \in \mathbb{F}[x_1,\dots,x_n]$ over a field $\mathbb{F}$ denotes 
the number of nonzero terms in $f$. 

A weight function $w : [n] \to [m]$ can be naturally extended to a set 
$S \in 2^{[n]}$ by defining $w(S) = \sum_{i \in S} w(i)$. 


Definition 1 (Subset Sum problem (SSUM)). Given $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$, 
the subset sum problem is to decide whether $t$ is a realisable target with respect 
to $(a_1,\dots,a_n)$, i.e., whether there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. Here, $n$ is called 
the size, $t$ is the target, and any $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$ is a realisable set 
of the subset sum instance. 


Assumptions. Throughout the paper, we assume that $t \ge \max a_i$, for simplicity. 
Also, we work in the Turing model, where basic operations like addition and 
multiplication over $\mathbb{F}_p$ are not unit-cost, unlike the word-RAM model considered 
in [6]; in the word-RAM model our results would be slightly better, 
shaving one $\log p$ factor. 


Lemma 1 ([21, Isolation Lemma]). Let $n$ and $N$ be positive integers, and 
let $F$ be an arbitrary family of subsets of $[n]$. Suppose $w(x)$ is an integer weight 
given to each element $x \in [n]$ uniformly and independently at random from $[N]$. 
The weight of $S \in F$ is defined as $w(S) = \sum_{x \in S} w(x)$. Then, with probability at 
least $1 - n/N$, there is a unique set $S' \in F$ that has the minimum weight among 
all sets of $F$. 


Lemma 2 (Kane's Identity [16]). Let $f(x) = \sum_{i=0}^{d} c_i x^i$ be a polynomial of 
degree at most $d$ with coefficients $c_i$ being integers. Let $\mathbb{F}_q$ be the finite field of 
order $q = p^k > d+2$. For $0 \le t \le d$, define 

$$r_t = \sum_{x \in \mathbb{F}_q^*} x^{q-1-t} f(x) = -c_t \in \mathbb{F}_q.$$

Then, $r_t = 0 \iff c_t$ is divisible by $p$. 
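
Kane's identity can be checked numerically for a prime field. The sketch below is our own illustration, restricted to prime $q$ (so $q = p$); the function name `kane_sum` is hypothetical. It evaluates the sum using only a running total, which hints at why the identity is useful for low-space coefficient extraction.

```python
def kane_sum(coeffs, t, q):
    """Return r_t = sum over x in F_q^* of x^(q-1-t) * f(x), reduced mod prime q,
    where f(x) = sum_i coeffs[i] * x^i."""
    def f(x):
        return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q

    return sum(pow(x, q - 1 - t, q) * f(x) for x in range(1, q)) % q
```

For example, with $f(x) = 3 + 5x^2 + 7x^3 + 14x^4$ and $q = 7$, the sum vanishes exactly for those $t$ whose coefficient $c_t$ is divisible by 7, and otherwise equals $-c_t \bmod 7$.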


Lemma 3 (Newton's Identities). Let $X_1,\dots,X_n$ be $n \ge 1$ variables. Let 
$P_m(X_1,\dots,X_n) = \sum_{i=1}^{n} X_i^m$ be the $m$-th power sum and $E_m(X_1,\dots,X_n)$ 
be the $m$-th elementary symmetric polynomial, i.e. $E_m(X_1,\dots,X_n) = 
\sum_{1 \le j_1 < \dots < j_m \le n} X_{j_1} \cdots X_{j_m}$; then 

$$m \cdot E_m(X_1,\dots,X_n) = \sum_{i=1}^{m} (-1)^{i-1}\, E_{m-i}(X_1,\dots,X_n) \cdot P_i(X_1,\dots,X_n).$$

Lemma 4 (Vieta's formulas). Let $f(x) = \prod_{i=1}^{n} (x - a_i)$ be a 
monic polynomial of degree $n$. Then, $f(x) = \sum_{i=0}^{n} c_i x^i$, where $c_{n-i} = 
(-1)^i E_i(a_1,\dots,a_n)$, $\forall\, 1 \le i \le n$, and $c_n = 1$. 
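
Lemmas 3 and 4 together turn power sums of unknown roots into the coefficients of the monic polynomial with those roots. The following sketch shows this over a prime field; it is a straightforward quadratic-time illustration with hypothetical function names, and it assumes Python 3.8+ for the modular inverse `pow(j, -1, q)`.

```python
def elementary_from_power_sums(P, q):
    """Given power sums P[1..m] mod a prime q > m (P[0] is unused),
    return [E_0, ..., E_m] using j * E_j = sum_{i=1..j} (-1)^(i-1) E_{j-i} P_i."""
    m = len(P) - 1
    E = [1] + [0] * m
    for j in range(1, m + 1):
        acc = 0
        for i in range(1, j + 1):
            acc += (-1) ** (i - 1) * E[j - i] * P[i]
        E[j] = acc * pow(j, -1, q) % q  # needs j invertible mod q, i.e. q > m
    return E

def monic_from_power_sums(P, q):
    """Coefficients [c_0, ..., c_m] of prod (x - root), via c_{m-j} = (-1)^j E_j."""
    E = elementary_from_power_sums(P, q)
    m = len(E) - 1
    c = [0] * (m + 1)
    for j in range(m + 1):
        c[m - j] = (-1) ** j * E[j] % q
    return c
```

As a small check, the roots 2, 3, 3 modulo 11 have power sums 8, 0, 7, and the recovered polynomial is $x^3 + 3x^2 + 10x + 4 \equiv (x-2)(x-3)^2 \pmod{11}$.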


Lemma 5 (Polynomial division with remainder [22, Theorem 9.6]). 
Given a degree-$d$ polynomial $f$ and a linear polynomial $g$ over a finite field $\mathbb{F}_p$, 
there exists a deterministic algorithm that finds the quotient and remainder of $f$ 
divided by $g$ in $\tilde{O}(d \log p)$ time. 


Definition 2 (Order of a number mod $p$). The order of $a \pmod p$, denoted 
$ord_p(a)$, is defined to be the smallest positive integer $m$ such that $a^m \equiv 1 \bmod p$. 


Note that when $p$ is prime, $ord_p(a)$ is clearly finite since $a^{p-1} \equiv 1 \bmod p$, 
by Fermat's Little Theorem. Emil Artin (1927, see [23]) conjectured that for 
any non-square $a \in \mathbb{Z} \setminus \{-1\}$, there exist infinitely many primes $p$ such that $a$ is a 
primitive root modulo $p$, i.e. $ord_p(a) = p-1$. There has been an impressive amount 
of work done to understand the behaviour and distribution of $ord_p(a)$ [24-26]. In 
particular, we have the following. 


Theorem 4 ([27]). There exists an $\tilde{O}(p^{1/4+\epsilon})$ time algorithm to deterministically 
find a primitive root over $\mathbb{F}_p$. 


Theorem 5 ([28]). For $n \ge 25$, there is a prime in the interval $[n, 6/5 \cdot n]$. 


Here is the most important lemma, which is an extension of [6, Lemma 4], 
where the authors considered the simplest form. In this paper, we need the 
extension for the 'robust' usage of this lemma (in Sect. 3). 


244 P. Dutta and M. S. Rajasree 


Lemma 6 (Coefficient Extraction Lemma). Let $A(x) = \prod_{i \in [n]} (1 + W^b \cdot 
x^{a_i})$, for any non-negative integers $a_i$, $b$ and $W \in \mathbb{Z}$. Then, for a prime $p > t$, 
one can compute $coef_{x^r}(A(x)) \bmod p$ for all $0 \le r \le t$, in time $\tilde{O}((n + 
t \log(Wb)) \log p)$. 


3 Proof of Theorem 1 


We present an $\tilde{O}(k(n + t))$-time deterministic algorithm for outputting all 
the hamming weights of the solutions, given a Hamming-$k$-SSSUM 
instance, i.e. there are at most $k$-many solutions to the SSUM instance 
$(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$. 


Proof of Theorem 1. We start with some notations that we will use throughout 
the proof. 


> Basic notations. Assume that the SSUM instance $(a_1,\dots,a_n,t) \in \mathbb{Z}_{\ge 0}^{n+1}$ has 
exactly $m$ ($m \le k$) many solutions, and they have $\ell$ many distinct hamming 
weights $w_1,\dots,w_\ell$; since two solutions can have the same hamming weight, $\ell \le m$. 
Moreover, assume that there are $\lambda_i$ many solutions which appear with hamming 
weight $w_i$, for $i \in [\ell]$. Thus, $\sum_{i \in [\ell]} \lambda_i = m \le k$. 

> Choosing prime q and a primitive root μ. We will work with a fixed $q$ in this 
proof, where $q > n+k+t =: M$ (we will explain later why this requirement is 
needed). We can find a prime $q$ in $\tilde{O}(n + k + t)$ time, since we can go over every 
element in the interval $[M, 6/5 \cdot M]$, in which we know a prime exists (Theorem 
5), and primality testing is efficient [29]. Once we find $q$, we choose $\mu$ such that 
$\mu$ is a primitive root over $\mathbb{F}_q$, i.e. $ord_q(\mu) = q-1$. This $\mu$ can be found in 
$\tilde{O}((n + k + t)^{1/4+\epsilon})$ time using Theorem 4. Thus, the total time complexity of 
this step is $\tilde{O}(n + k + t)$. 

> The polynomials. Define the $k$-many univariate polynomials as follows: 

$$f_j(x) = \prod_{i \in [n]} (1 + \mu^j x^{a_i}), \quad \forall j \in [k].$$

We remark that we do not know $\ell$ a priori, but we can find $m$ efficiently. 


Claim 1 (Finding the exact number of solutions). Given a Hamming-$k$-SSSUM 
instance, one can find the exact number of solutions, $m$, deterministically, in 
$\tilde{O}((n + t) \log q)$ time. 


Proof. Use [6] (see Lemma 6 for the general statement), which gives a deterministic 
algorithm to find the coefficient of $x^t$ of $\prod_{i \in [n]} (1 + x^{a_i})$ over $\mathbb{F}_q$; this takes 
time $\tilde{O}((n + t) \log q)$. This coefficient equals $m$, and since $m \le k < q$, its 
residue mod $q$ determines $m$ exactly. 


Since we know the exact value of $m$, we will just work with $f_j$ for $j \in [m]$, 
which suffices for our algorithmic purpose. Here is an important claim about 
the coefficients of $x^t$ in the $f_j$'s. 
Claim 2. $C_j = coef_{x^t}(f_j(x)) = \sum_{i \in [\ell]} \lambda_i \cdot \mu^{j w_i}$, for each $j \in [m]$. 


Proof. If $S \subseteq [n]$ is a solution to the instance with hamming weight, say $w$, then 
this will contribute $\mu^{jw}$ to the coefficient of $x^t$ of $f_j(x)$. Since there are $\ell$ many 
weights $w_1,\dots,w_\ell$ with multiplicities $\lambda_1,\dots,\lambda_\ell$, the claim easily follows. 


Using Lemma 6, we can find $C_j \bmod q$ for each $j \in [m]$ in $\tilde{O}((n + t \log(\mu j)) \log q)$ 
time, giving a total of $\tilde{O}(k(n+t))$, since $q = O(n+k+t)$, $\mu \le q-1$, and $\sum_{j \in [m]} \log j = 
\log(m!) \le \log(k!) = \tilde{O}(k)$. 

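Using Newton's Identities (Lemma 3), we have the following relations, 
for $j \in [m]$: 

$$E_j(\mu^{w_1},\dots,\mu^{w_\ell}) \equiv j^{-1} \cdot \sum_{i=1}^{j} (-1)^{i-1}\, E_{j-i}(\mu^{w_1},\dots,\mu^{w_\ell}) \cdot P_i(\mu^{w_1},\dots,\mu^{w_\ell}) \bmod q. \qquad (1)$$

In the above, by $E_j(\mu^{w_1},\dots,\mu^{w_\ell})$, we mean 
$E_j(\underbrace{\mu^{w_1},\dots,\mu^{w_1}}_{\lambda_1 \text{ times}}, \underbrace{\mu^{w_2},\dots,\mu^{w_2}}_{\lambda_2 \text{ times}}, \dots, \underbrace{\mu^{w_\ell},\dots,\mu^{w_\ell}}_{\lambda_\ell \text{ times}})$, 
and similarly for $P_j$. Since $q > k$, $j^{-1} \bmod q$ exists, and thus 
the above relations are valid. Here is another important and obvious observation, 
just from the definition of the $P_j$'s: 


Observation 1. For $j \in [k]$, $C_j \equiv P_j(\mu^{w_1},\dots,\mu^{w_\ell}) \bmod q$. 


Note that we know $E_0 = 1$, and the $P_j$'s (and $j^{-1} \bmod q$) are already computed. 
To compute $E_j$, we need to know $E_1,\dots,E_{j-1}$ and additionally we need 
$O(j)$ many additions and multiplications. Suppose $T(j)$ is the time to compute 
$E_1,\dots,E_j$. Then, the trivial complexity is $T(m) \le \tilde{O}(k^2 \log q) + \tilde{O}(k(n+t))$. But 
one can do better than $\tilde{O}(k^2 \log q)$ and make it $\tilde{O}(k \log q)$ (i.e. solve the 
recurrence using FFT), bringing the total complexity to $T(m) \le \tilde{O}(k(n + t))$ (since 
$q = O(n+k+t)$). 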

Once we have computed $E_j$, for $j \in [m]$, define a new polynomial 

$$g(x) = \sum_{j=0}^{m} (-1)^j \cdot E_j(\mu^{w_1},\dots,\mu^{w_\ell}) \cdot x^{m-j}.$$

Using Lemma 4, it is immediate that $g(x) = \prod_{i \in [\ell]} (x - \mu^{w_i})^{\lambda_i}$. Further, by 
definition, $\deg(g) = m$. From $g$, we now want to extract the roots, namely $\mu^{w_1},\dots,\mu^{w_\ell}$, 
over $\mathbb{F}_q$. We do this by checking whether $(x - \mu^i)$ divides $g$, for $i \in [n]$ (since 
$w_i \le n$). Using Lemma 5, a single division with remainder takes $\tilde{O}(k \log q)$; 
therefore, the total time to find all the $w_i$ is $\tilde{O}(nk \log q) = \tilde{O}(nk)$. 
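
Putting the steps of this section together, the whole pipeline can be sketched in a few lines. This is a naive illustration under our own assumptions, not the paper's $\tilde{O}(k(n+t))$ algorithm: coefficients are extracted with an $O(nt)$ dynamic program instead of Lemma 6, Newton's recurrence is solved in quadratic time rather than with FFT, the prime and primitive root are found by brute force, and divisibility by $(x - \mu^w)$ is tested by evaluating $g(\mu^w)$. All function names are hypothetical.

```python
def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def primitive_root(q):
    for g in range(2, q):
        x, order = g, 1
        while x != 1:
            x, order = x * g % q, order + 1
        if order == q - 1:
            return g
    raise ValueError("no primitive root found")

def coeff_xt(a, t, c, q):
    """Coefficient of x^t in prod_i (1 + c * x^{a_i}) over F_q (naive DP)."""
    poly = [1] + [0] * t
    for ai in a:
        for s in range(t, ai - 1, -1):
            poly[s] = (poly[s] + c * poly[s - ai]) % q
    return poly[t]

def hamming_weights(a, t, k):
    """Distinct hamming weights of all solutions, assuming at most k solutions."""
    n = len(a)
    q = n + k + t + 1                      # smallest candidate with q > n + k + t
    while not is_prime(q):
        q += 1
    mu = primitive_root(q)
    m = coeff_xt(a, t, 1, q)               # number of solutions; exact since m <= k < q
    if m == 0:
        return []
    # Power sums: C_j = P_j evaluated at the mu^{w_i}, with multiplicities.
    P = [0] + [coeff_xt(a, t, pow(mu, j, q), q) for j in range(1, m + 1)]
    # Newton's identities: j * E_j = sum_{i=1..j} (-1)^(i-1) E_{j-i} P_i.
    E = [1] + [0] * m
    for j in range(1, m + 1):
        acc = sum((-1) ** (i - 1) * E[j - i] * P[i] for i in range(1, j + 1))
        E[j] = acc * pow(j, -1, q) % q

    def g(x):                              # g(x) = sum_j (-1)^j E_j x^(m-j)
        return sum((-1) ** j * E[j] * pow(x, m - j, q) for j in range(m + 1)) % q

    # (x - mu^w) divides g exactly when g(mu^w) == 0 over F_q.
    return [w for w in range(n + 1) if g(pow(mu, w, q)) == 0]
```

For instance, the instance $(1,2,3,4)$ with $t = 5$ has the two solutions $\{1,4\}$ and $\{2,3\}$, both of weight 2, so the sketch reports the single weight 2.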

Here, we remark that we do not use deterministic root finding or factoring 
algorithms (e.g. [30,31]), since they take $\tilde{O}(m q^{1/2}) = \tilde{O}(k \cdot (k + t)^{1/2})$ time, 
which could be larger than $\tilde{O}(k(n + t))$. 
> Reason for choosing q and μ. In hindsight, there are three important 
properties of the prime $q$ that suffice to successfully output the $w_i$'s using the 
steps described above: 


1. Since Lemma 6 requires computing the inverses of numbers up to $t$, 
we want $q > t$. 

2. While computing $E_j(\mu^{w_1},\dots,\mu^{w_\ell})$ using Lemma 3 above, one should 
be able to compute the inverses of all $j$'s less than or equal to $m$. So, we want 
$q > m$. 

3. To obtain $w_i$ from $\mu^{w_i} \bmod q$, we want $ord_q(\mu) > n$ (for the definition see 
Definition 2). Since $w_i \le n$, this ensures that we have found the correct 
$w_i$. 


Here, we remark that we do not need to concern ourselves with the 'largeness' 
of the coefficients $C_j$ and with making them nonzero mod $q$, as required in [6]. For 
the first two points, it suffices to choose $q > k + t$. Since $\mu$ is a primitive root 
over $\mathbb{F}_q$, this guarantees that $ord_q(\mu) = q-1 > n$ and thus we will find $w_i$ from 
$\mu^{w_i}$ correctly. 


> Total time complexity. The time complexity to find the correct $m$, $q$ and $\mu$ 
is $\tilde{O}(n + k + t)$. Finding the coefficients of $g$ takes $\tilde{O}(k(n + t))$ time and then 
finding the $w_i$ from $g$ takes $\tilde{O}(nk \log q)$ time. Thus, the total time complexity remains 
$\tilde{O}(k(n + t))$. 


Remark 1. The above algorithm can be extended to find the multiplicities $\lambda_i$ 
in $\tilde{O}(k(n + t) + k^{3/2})$ time by finding the largest $\lambda_i$, by binary search, such that 
$(x - \mu^{w_i})^{\lambda_i}$ divides $g(x)$. Finding each $\lambda_i$ takes $\tilde{O}(m \log q \log(\lambda_i))$ time over $\mathbb{F}_q$, 
for the same $q$ as above, since the polynomial division takes $\tilde{O}(m \log q)$ time and 
binary search introduces a multiplicative $O(\log(\lambda_i))$ term. We have $\sum_{i \in [\ell]} \log(\lambda_i) = 
\log(\prod_{i \in [\ell]} \lambda_i)$; using AM-GM, $\prod_{i \in [\ell]} \lambda_i \le (m/\ell)^{\ell}$, which is maximized at $\ell = 
\sqrt{m} \le \sqrt{k}$, implying $\sum_{i \in [\ell]} \log(\lambda_i) \le O(\sqrt{k} \log k)$. Since $m \le k$, this explains 
the additive $k^{3/2}$ term in the complexity. 


4 Proof of Theorem 2 


In this section, we present a low space algorithm for finding all the realisable 
sets for $k$-SSSUM. Our low space algorithm builds upon a fundamental 
number-theoretic identity [16] and efficient sparse multivariate polynomial 
reconstruction [13]. 


Proof of Theorem 2. Here are some notations that we will follow throughout 
the proof. 

> Basic Notations. Let us assume that there are exactly $m$ ($m \le k$) many 
realisable sets $S_1,\dots,S_m$, each $S_i \subseteq [n]$. We remark that for our algorithm we 
do not need to calculate $m$ a priori. 
> The Multivariate Polynomial. For our purpose, we will be working with the 
following $(n + 1)$-variate polynomial: 

$$f(x, y_1,\dots,y_n) = \prod_{i \in [n]} (1 + y_i x^{a_i}).$$

Since we have a $k$-SSSUM instance $(a_1,\dots,a_n,t)$, $coef_{x^t}(f)$ has the following 
properties. 


1. It is an $n$-variate polynomial $p_t(y_1,\dots,y_n)$ with sparsity exactly $m$. 
2. $p_t$ is a multilinear polynomial in $y_1,\dots,y_n$, i.e. the individual degree of each $y_i$ is at 
most 1. 
3. The total degree of $p_t$ is at most $n$. 
4. If $S \subseteq [n]$ is a realisable set, then $y_S := \prod_{i \in S} y_i$ is a monomial in $p_t$. 


In particular, the following is an immediate but important observation. 


Observation 2. $p_t(y_1,\dots,y_n) = \sum_{i \in [m]} y_{S_i}$. 


Therefore, it suffices to know the polynomial $p_t$. However, we cannot treat the $y_i$ 
as new variables and try to find the coefficient of $x^t$, since the trivial 
multiplication algorithm (involving $n + 1$ variables) takes $\exp(n)$ time. This is because 
$f(x, y_1,\dots,y_n) \bmod x^{t+1}$ can have $2^n \cdot t$ many monomials, as the coefficient of $x^i$, 
for any $i \le t$, can have $2^n$ many multilinear monomials. 

However, if we substitute $y_i = c_i \in \mathbb{F}_q$, for some prime $q$, we claim that we 
can figure out the value $p_t(c_1,\dots,c_n)$ from the coefficient of $x^t$ in $f(x, c_1,\dots,c_n)$ 
efficiently (see Claim 3). Once we have figured this out, we can simply interpolate 
using the following theorem to reconstruct the polynomial $p_t$. Before going into 
the technical details, we state the sparse interpolation theorem below; for 
simplicity we consider multilinearity (though [13] holds for general polynomials as 
well). 
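
The claim that substituting $y_i = c_i$ yields the value $p_t(c_1,\dots,c_n)$ as a single coefficient can be checked on small instances. The sketch below is ours and deliberately naive: it extracts the coefficient with an $O(t)$-space dynamic program rather than the logspace method based on Kane's identity, and compares it against a brute-force sum over realisable sets (Observation 2). Function names are hypothetical.

```python
from itertools import combinations

def p_t_via_coefficient(a, t, c, q):
    """Coefficient of x^t in prod_i (1 + c_i x^{a_i}) over F_q, i.e. p_t(c_1..c_n)."""
    poly = [1] + [0] * t
    for ai, ci in zip(a, c):
        for s in range(t, ai - 1, -1):
            poly[s] = (poly[s] + ci * poly[s - ai]) % q
    return poly[t]

def p_t_via_solutions(a, t, c, q):
    """Reference value: sum over realisable sets S of prod_{i in S} c_i (mod q)."""
    n, total = len(a), 0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            if sum(a[i] for i in S) == t:
                prod = 1
                for i in S:
                    prod = prod * c[i] % q
                total = (total + prod) % q
    return total
```

With $a = (1,2,3,4)$, $t = 5$ and the evaluation point $c = (2,3,5,7)$, both functions give $2 \cdot 7 + 3 \cdot 5 = 29$, one product per realisable set.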


Theorem 6 ([13]). Given black box access to a multilinear polynomial $g(x_1,\dots,x_n)$ 
of degree $d$ and sparsity at most $s$ over a finite field $\mathbb{F}$ with $|\mathbb{F}| \ge (nd)^6$, there is 
a poly$(snd)$-time and $O(\log(snd))$-space algorithm that outputs all the monomials 
of $g$. 


Remark. We represent a monomial in terms of its indices (to make it consistent 
with the notion of a realisable set), i.e. for a monomial $x_1 x_5 x_9$, the corresponding 
index set is $\{1,5,9\}$. Also, we do not include the indices in the space complexity, 
as mentioned earlier. 


> Brief Analysis on the Space Complexity of [13]. Klivans and Spielman [13] did 
not explicitly mention the space complexity. However, it is not hard to show 
that the required space is indeed $O(\log(snd))$. [13] shows that substituting $x_i = 
y^{k^{i-1} \bmod p}$ for some $k \in [2s^2 n]$ and $p > 2s^2 n$ makes the exponents of the new 
univariate polynomial (in $y$) distinct (see [13, Lemma 3]); the algorithm actually 
tries all $k$ and finds the correct $k$. Note that the degree becomes $O(s^2 nd)$. 
Then, it first tries to find the coefficients by simple univariate interpolation 
[13, Section 6.3]. Since we have blackbox access to $g(a_1,\dots,a_n)$, finding 
a single coefficient by univariate interpolation (which basically sets up linear 
equations and solves them) takes $O(\log(snd))$ space and poly$(snd)$ time only. In the 
last step, to find one coefficient, we can use the standard univariate interpolation 
algorithm which uses Vandermonde matrices, and one entry of the inverse of 
the Vandermonde matrix is log-space computable$^2$. 

At this stage, we know the coefficients (one by one), but we do not know
which monomials the coefficients belong to. However, it suffices to substitute
$x_i = y^{k^{i-1} \bmod p}$. Using this, we can find the correct value of the first exponent in the
monomial. For example, if after the correct substitution $y^{10}$ appears with coefficient
5, then in the next step we change just $x_1$: if this does not affect the coefficient 5, $x_1$ is
not in the monomial that has coefficient 5;
otherwise it is (here we also use that the polynomial is multilinear and hence the
change in the coefficient must be reflected). This step again requires univariate
interpolation, and one has to repeat this experiment with respect to each variable to know
exactly the monomial corresponding to the coefficient we are working with. We
can reuse the space for interpolation, and after one round of checking with every
variable, the algorithm outputs one exponent at this stage. This requires O(log(snd)) space
and poly(snd) time.

With a more careful analysis, one can further improve the field requirement
to $|F| \ge (nd)^6$ only (and not dependent on s); for details see [13, Thm. 5 & 11].

Now we come back to our subset sum problem. Since we want to reconstruct
an n-variate m-sparse polynomial $p_t$ which has degree at most n, it suffices to
work with $|F| \ge n^{12}$. However, we also want to use Kane's identity (Lemma
2), which requires $q > \deg(f(x, c_1, \dots, c_n)) + 2$, and $\deg(f(x, c_1, \dots, c_n)) \le nt$.
Denote $M := \max(nt + 3, n^{12})$. Thus, it suffices to work with $F = \mathbb{F}_q$ where
$q \in [M, (6/5) \cdot M]$; such a prime exists (Theorem 5) and is easy to find deterministically
in poly(nt) time and O(log(nt)) space using [29]. In particular, we will substitute
$y_i = c_i \in [0, q - 1]$.


Claim 3. Fix $c_i \in [0, q - 1]$, where $q \in [M, (6/5) \cdot M]$. Then, there is a poly(nt)-time
and O(log(nt))-space algorithm which computes $p_t(c_1, \dots, c_n)$ over $\mathbb{F}_q$.


Proof. Note that we can evaluate each $1 + c_i x^{a_i}$ at some $x = \alpha \in \mathbb{F}_q$ in O(log nt)
time and O(log(nt)) space. Multiplying n of them takes O(n log(nt)) time and
O(log(nt)) space.


Once we have computed $f(\alpha, c_1, \dots, c_n)$ over $\mathbb{F}_q$, using Kane's identity
(Lemma 2), we can compute $p_t(c_1, \dots, c_n)$, since

$$p_t(c_1, \dots, c_n) = -\sum_{\alpha \in \mathbb{F}_q^*} \alpha^{q-1-t} f(\alpha, c_1, \dots, c_n).$$


2 In fact, Vandermonde determinant and inverse computations are in $TC^0 \subseteq$
LOGSPACE, see [32].

As each evaluation $f(\alpha, c_1, \dots, c_n)$ takes O(n log(nt)) time, and we need $q - 1$
many additions, multiplications and modular exponentiations, the total time to
compute $p_t(c_1, \dots, c_n)$ is poly(nt). The required space still remains O(log(nt)).
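
The evaluation in Claim 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the log-space procedure itself: it takes $f(x, c_1, \dots, c_n) = \prod_i (1 + c_i x^{a_i})$ (the product whose factors are evaluated in the proof above), uses a small hand-picked prime q with $q - 2 \ge \deg f$, and all function names are hypothetical. The brute-force routine is included only as a cross-check.

```python
from itertools import combinations


def eval_f(alpha, cs, nums, q):
    """Evaluate prod_i (1 + c_i * alpha^{a_i}) over F_q (assumed form of f)."""
    result = 1
    for c, a in zip(cs, nums):
        result = result * (1 + c * pow(alpha, a, q)) % q
    return result


def coeff_via_kane(t, cs, nums, q):
    """Coefficient of x^t in f(x, c) over F_q, computed as
    -sum_{alpha in F_q^*} alpha^{q-1-t} * f(alpha, c).
    Assumes q is prime, deg f <= q - 2 and 0 <= t <= q - 2."""
    total = 0
    for alpha in range(1, q):
        total = (total + pow(alpha, q - 1 - t, q) * eval_f(alpha, cs, nums, q)) % q
    return (-total) % q


def coeff_brute_force(t, cs, nums, q):
    """Sum of prod_{i in S} c_i over all realisable sets S (exponential time)."""
    n = len(nums)
    total = 0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(nums[i] for i in subset) == t:
                prod = 1
                for i in subset:
                    prod = prod * cs[i] % q
                total = (total + prod) % q
    return total
```

The sum runs over only the $q - 1$ nonzero field elements and keeps a single accumulator, which mirrors why the space stays small; the brute-force check, in contrast, enumerates all $2^n$ subsets.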


Once we have calculated $p_t(c_1, \dots, c_n)$ efficiently, we try different values
of $(c_1, \dots, c_n)$ to reconstruct $p_t$ using Theorem 6. Since $p_t$ is an n-variate polynomial
of sparsity at most k and degree at most n, this still takes poly(knt) time and
O(log(knt)) space. This finishes the proof.


5 Conclusion 


This work introduces some interesting search versions of variants of the SSUM
problem and gives efficient algorithms for each of them. This opens a variety of
questions which require further rigorous investigation.


1. Can we improve the time complexity of Theorem 2? Because of using Theorem 
6, the complexity for interpolation is already cubic. Whether some other 
algebraic (non-algebraic) techniques can improve the time complexity, while 
keeping it low space, is not at all clear. 

2. Can we use these algebraic-number-theoretic techniques, to give a determin- 
istic O(n +t) time algorithm for decision version of SSUM? 

3. Can we improve Remark 1 to find both the Hamming weights $w_i$ as well as
the multiplicities $\lambda_i$, in O(k(n + t)) time?


Abstract. We study several key variants of SMTI, the Stable Marriage
problem in which the preference lists may contain ties and may be
incomplete. A matching is called weakly stable unless there is a man
and a woman such that they are currently not matched with each other 
but if they get matched with each other, then both of them become 
better off. The COM SMTI problem is to decide whether there exists 
a complete (in which all men and women are matched) weakly sta- 
ble matching in an SMTI instance. It is known that the COM SMTI 
problem is NP-complete. We strengthen this result by proving that this 
problem remains NP-complete even for the instance SMTI-C, instance 
where members in each preference list are consecutive with respect to 
some orderings of the set of men and set of women. On the positive 
side, we give a polynomial time algorithm for COM SMTI problem for 
the instance SMTI-STEP, where the preference lists admit step-property, 
that is, preference list of every man m, is the set of all women w; such 
that 7 < 7 for some ordering of men and some ordering of women. Further, 
DECIDE_MAX SMTI (resp. DECIDE_MIN SMTI) is the decision ver- 
sion of MAX SMTI (resp. MIN SMTI), the problem of finding a weakly 
stable matching of maximum (resp. minimum) cardinality in an SMTI 
instance. Both DECIDE_MAX SMTI and DECIDE_MIN SMTI problems 
are known to be NP-complete. We improve these results by showing that 
DECIDE_MAX SMTI and DECIDE_MIN SMTI problems remain NP- 
complete even for the case where the preference lists admit inclusion 
ordering and even for the case where the preference lists admit step- 
property, respectively. Finally, we present a 3/2-approximation algorithm 
for the MIN SMTI problem with inclusion ordering. 


Keywords: Stable matching · Polynomial time algorithm ·
NP-complete · Approximation algorithm


1 Introduction 


An instance of the Stable Marriage problem (SM) consists of n men, n women,
and their preference lists. A preference list of a man (resp. woman) is a set
containing all women (resp. men) in strict order of preference. We denote the preference
list of a member a by P(a). The task is to pair the men and women together
such that there are no two individuals of the opposite sex who would both prefer 
each other over their current partners. When there are no such pairs, the set of 
marriages is said to be stable. Furthermore, a tie is a set of individuals which 
are preferred equally by some person. We use SMT to denote the variant of 
SM that may contain ties in the preference lists. Note that the preference lists 
are considered to be complete in SM as well as in SMT. Further, we use SMI 
to denote the variant of SM where preference lists may be incomplete, whereas 
SMTI stands for the variant of SM where the preference lists may be incomplete 
as well as may contain ties. An ordering of men (resp. women) is said to be a
consecutive ordering if the members in each woman's (resp. man's) preference list
are consecutive. An SMTI instance having a consecutive ordering of men as well
as of women is said to be consecutive, denoted by SMTI-C. An SMTI instance
I is said to be inclusive, denoted by SMTI-INC, if the members of one of the
two sets, say men, can be linearly ordered as $m_1, m_2, \dots, m_n$ such that, for
$i, j = 1$ to $n$, $P(m_i) \subseteq P(m_j)$ if $i < j$. Furthermore, an SMTI instance satisfying
the step property, that is, for all $m_i \in M$, $P(m_i) = \{w_j \in W \mid j \le i\}$ for some
ordering of men and some ordering of women, is denoted by SMTI-STEP.
Three notions of stability namely weak, strong, and super are established in 
the literature [4] when ties are allowed in the preference lists. A matching is 
called weakly stable if there is no man and woman such that they are currently 
not matched with each other but if they get matched together, then both of them 
would strictly improve. A matching is called strongly stable if there is no man and 
woman such that they are currently not matched with each other but if they get 
matched together, then one of them is better off and the other is not worse off. A 
matching is called super stable if there is no man and woman such that they are 
currently not matched with each other but if they get matched together, then 
both of them are not worse off. We define these notions formally in the next 
section. However, of three notions of stability in the literature, weak stability 
has received most attention till now. In this paper, we are solely concerned with 
weakly stable matching. Henceforth for the rest of the paper, in presence of ties, 
the terms stability and stable matching will be considered as weak stability and 
weakly stable matching, respectively, unless stated otherwise. Based on these 
variants, the following decision problems are identified in the literature. 


COM SMTI 

Instance: An SMTI instance, i.e., n men, n women, and their preference lists.
Question: Does there exist a stable matching in which all men and women are
matched?

DECIDE_MAX (resp. DECIDE_MIN) SMTI

Instance: n men, n women, their preference lists, and an integer $k \in \mathbb{Z}^+$.
Question: Does there exist a stable matching of cardinality at least (resp.
at most) k in the given instance?


MAX SMTI and MIN SMTI, the optimisation versions of DECIDE_MAX
SMTI and DECIDE_MIN SMTI, respectively, are known to be NP-hard [6,8].
These problems remain NP-hard even if the ties are present at the end of
preference lists and on one side only, each tie is of size (length) 2, and there is at
most one tie per list [8]. Furthermore, COM SMTI is known to be NP-complete
[6,8]. Also, COM SMTI remains NP-complete for the case when each preference
list is of size at most 3 and ties occur on one side only [5]. This also implies the
NP-hardness of MAX SMTI for this restricted case. Regarding
approximability results, a 3/2-approximation algorithm is known for MAX SMTI [7,9], but
for MIN SMTI, no constant factor approximation has been identified in the
literature. Halldórsson et al. [2] proposed a $(1 + t(I)/OPT(I))$-approximation algorithm
for MIN SMTI, where t(I) is the number of preference lists that contain ties in
the instance I of SMTI and OPT(I) is the optimal solution size of I, i.e., the
cardinality of a minimum size stable matching.

In this paper, we present the first ever study of the Stable Marriage prob- 
lem involving ties and incomplete lists solely based on analyzing the pattern of 
preference lists. The following list summarizes our key contributions. 


1. We strengthen the NP-completeness result of the COM SMTI problem by
establishing that this problem remains NP-complete for SMTI-C instances.

2. We present an $O(n^2)$-time algorithm for the COM SMTI-STEP problem.

3. We improve the NP-completeness result of DECIDE_MAX SMTI problem by 
showing that DECIDE_MAX SMTI-INC is NP-complete. 

4. Further, we prove that DECIDE_MIN SMTI-STEP is NP-complete, strength- 
ening the NP-completeness of DECIDE_MIN SMTI problem. 

5. Finally, we propose a 3/2-approximation algorithm for the MIN SMTI-INC 
problem. 


2 Preliminaries 


We define stable matching formally. Let $M = \{m_1, m_2, m_3, \dots, m_n\}$ and
$W = \{w_1, w_2, w_3, \dots, w_n\}$ be two sets, each of cardinality n, consisting of men and
women, respectively. Each member of M and W has a preference list in which
he/she ranks the members of the opposite set in decreasing order of preference.
We say that a pair (m, w) is admissible if w is present in m's preference list
and m is present in w's preference list. A matching $M'$ is a subset of $M \times W$
such that $|M'(m_i)| \le 1$ for all $m_i \in M$ and $|M'(w_j)| \le 1$ for all $w_j \in W$, where
$M'(m_i)$ denotes the set of women matched with $m_i$ and $M'(w_j)$ denotes the
set of men matched with $w_j$ in $M'$. Note that $|M'(a)|$ can be either 0 or 1. If
$|M'(a)| = 1$, i.e., $M'(a) = \{b\}$ where b is a person of the opposite sex, then we say
that a is matched with b in $M'$. Otherwise, if $|M'(a)| = 0$, then we say that
a is unmatched in $M'$. A complete matching is a matching in which all men
and women are matched. A blocking pair of a matching $M'$ is an admissible
man-woman pair $(m_i, w_j) \in (M \times W) \setminus M'$ such that $m_i$ is unmatched or prefers
$w_j$ to his current partner, i.e., $M'(m_i)$, and $w_j$ is unmatched or prefers $m_i$ to
her current partner, i.e., $M'(w_j)$. A matching $M'$ is said to be stable if it has no
blocking pair. The existence of a stable matching in SM is implied by the classical
Gale-Shapley algorithm [1] given in 1962. The above definition can be extended
to the case where preference lists may contain ties. A tie of an individual m's
preference list is a set of individuals whom m prefers equally, and the preference
list of m is a strict order of ties. We say that m strictly prefers $w_1$ to $w_2$ (denoted
by $w_1 \succ_m w_2$) if $w_1$ is in tie $T_1$ and $w_2$ is in tie $T_2$ in m's preference list, and m
ranks $T_1$ before $T_2$. We write $w_1 =_m w_2$ if $w_1$ and $w_2$ are present in the same
tie or if $w_1$ and $w_2$ are the same person. We say that m weakly prefers $w_1$ to $w_2$
if $w_1 =_m w_2$ or $w_1 \succ_m w_2$ holds, and write $w_1 \succeq_m w_2$. When ties are involved,
three notions of stability named weak, strong, and super are identified in the
literature [4].

A weak blocking pair for a matching $M'$ is an admissible pair $(m, w) \notin M'$
such that $w \succeq_m M'(m)$ and $m \succeq_w M'(w)$. A super stable matching is a matching
that admits no weak blocking pair.

A strong blocking pair for a matching $M'$ is an admissible pair $(m, w) \notin M'$ such
that either $w \succ_m M'(m)$ and $m \succeq_w M'(w)$, or $w \succeq_m M'(m)$ and $m \succ_w M'(w)$.
A strongly stable matching is a matching that admits no strong blocking pair.

A super blocking pair for a matching $M'$ is an admissible pair $(m, w) \notin M'$
such that $w \succ_m M'(m)$ and $m \succ_w M'(w)$. A weakly stable matching is
a matching that admits no super blocking pair. Since we are concerned with
weakly stable matchings in the presence of ties, henceforth, for the rest of the paper,
by a blocking pair, we mean a super blocking pair unless stated otherwise.
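
The weak-stability condition above can be checked mechanically. The following is a minimal sketch, assuming a representation that is not taken from the paper: each preference list is a Python list of ties (lists of names, most preferred tie first), a matching is a dict from men to women containing only admissible pairs, and an unmatched person is treated as preferring any admissible partner. All function names are hypothetical.

```python
def rank(pref, person):
    """Index of the tie containing `person` in a list of ties, or None if absent."""
    for idx, tie in enumerate(pref):
        if person in tie:
            return idx
    return None


def strictly_prefers(pref, new, current):
    """True if the list owner is unmatched or ranks `new` in an earlier tie than `current`."""
    if current is None:
        return True
    return rank(pref, new) < rank(pref, current)


def blocking_pairs(men_prefs, women_prefs, matching):
    """All admissible pairs (m, w) outside `matching` where both sides strictly improve."""
    partner_of_woman = {w: m for m, w in matching.items()}
    pairs = []
    for m, pref in men_prefs.items():
        for tie in pref:
            for w in tie:
                if rank(women_prefs[w], m) is None:  # pair is not admissible
                    continue
                if matching.get(m) == w:  # pair already in the matching
                    continue
                if (strictly_prefers(pref, w, matching.get(m))
                        and strictly_prefers(women_prefs[w], m, partner_of_woman.get(w))):
                    pairs.append((m, w))
    return pairs


def is_weakly_stable(men_prefs, women_prefs, matching):
    return not blocking_pairs(men_prefs, women_prefs, matching)
```

On a two-by-two instance where m1 is indifferent between w1 and w2, m2 accepts only w1, w1 is indifferent between m1 and m2, and w2 accepts only m1, both {m1: w1} and {m1: w2, m2: w1} pass this check, which illustrates why weakly stable matchings of an SMTI instance can differ in size.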

A consecutive ordering of men is an ordering $\langle m_1, m_2, \dots, m_n \rangle$ of
the members of M such that for all $w \in W$, the men in the preference list
of w are consecutive. Consecutivity of women can be defined analogously. An
SMTI instance is said to be consecutive, denoted by SMTI-C, if it admits a
consecutive ordering of men as well as of women. Next, an SMTI instance I is said to be inclusive
if the members of one set, say men, can be linearly ordered, that is, $m_1, m_2, \dots, m_n$,
such that $P(m_1) \subseteq P(m_2) \subseteq \dots \subseteq P(m_n)$. An ordering $\langle m_1, m_2, \dots, m_n,
w_1, w_2, \dots, w_n \rangle$ of $M \cup W$ is called an inclusion ordering if $P(m_1) \subseteq P(m_2) \subseteq
\dots \subseteq P(m_n)$ and $P(w_1) \supseteq P(w_2) \supseteq \dots \supseteq P(w_n)$. Further, an SMTI instance I
is said to possess the step property if for all $m_i \in M$, $P(m_i) = \{w_j \in W \mid j \le i\}$ for
some ordering of men and some ordering of women. An SMTI instance satisfying
the step property is denoted by SMTI-STEP.
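
Whether one side of an instance admits the nested ordering required for inclusiveness can be tested by sorting on list size. The sketch below is an illustration only; it assumes each preference list is given as a plain set of acceptable partners (ties are irrelevant for this test), and the function name is hypothetical.

```python
def inclusion_order(acceptable):
    """Return the members ordered so that their acceptable sets are nested
    (P(a_1) <= P(a_2) <= ...), or None if no such ordering exists.

    `acceptable` maps each member to the set of partners on his/her list.
    Sorting by set size suffices: in any chain under inclusion, a smaller
    set must come first and equal-sized sets must be identical."""
    order = sorted(acceptable, key=lambda a: len(acceptable[a]))
    for a, b in zip(order, order[1:]):
        if not acceptable[a] <= acceptable[b]:
            return None
    return order
```

Reversing the returned order for the other side gives the decreasing chain $P(w_1) \supseteq P(w_2) \supseteq \dots$ used in an inclusion ordering.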


3 Complete Stable Matching in SMTI-C and SMTI-STEP 


In this section, we show that the problem of finding a complete stable matching
in an SMTI-C instance is NP-complete, whereas the same problem is solvable in $O(n^2)$ time
for an SMTI-STEP instance.


3.1 COM SMTI-C Problem 


We prove that COM SMTI-C is NP-complete by giving a polynomial reduction
from the EXACT-MM problem for bipartite graphs, which asks whether, given
a graph G and a positive integer k, there exists a maximal matching of size
exactly k in G. The NP-completeness of the EXACT-MM problem for bipartite
graphs follows from MIN MM-D, which is known to be NP-complete for bipartite
graphs [10], where MIN MM-D is the decision version of MIN MM, the problem of
finding a minimum cardinality maximal matching in a graph. Note that one can
obtain a polynomial time reduction from the MIN MM-D problem to the EXACT-MM
problem by making use of the fact that maximal matchings satisfy the
interpolation property, i.e., G has a maximal matching of size s for every
$M_{min} \le s \le M_{max}$, where $M_{min}$ and $M_{max}$ are the sizes of a minimum maximal matching and
a maximum matching in G, respectively.
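
The interpolation property can be observed on small graphs by brute force. The sketch below is purely illustrative (exponential time, hypothetical function names) and assumes the graph is given as a list of edges whose endpoints have distinct names; it collects the sizes of all maximal matchings.

```python
from itertools import combinations


def is_matching(edges):
    """True if no vertex appears in more than one edge."""
    used = set()
    for u, v in edges:
        if u in used or v in used:
            return False
        used.update((u, v))
    return True


def is_maximal(matching, all_edges):
    """True if every edge of the graph touches a matched vertex."""
    used = {x for edge in matching for x in edge}
    return all(u in used or v in used for u, v in all_edges)


def maximal_matching_sizes(all_edges):
    """Sorted list of the sizes of all maximal matchings (brute force)."""
    sizes = set()
    for r in range(len(all_edges) + 1):
        for subset in combinations(all_edges, r):
            if is_matching(subset) and is_maximal(subset, all_edges):
                sizes.add(r)
    return sorted(sizes)
```

For the path x1, y1, x2, y2, x3, y3 the sizes found are 2 and 3, a contiguous range, as the interpolation property predicts.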


Theorem 1. The COM SMTI-C problem is NP-complete. 


Proof. Given a matching M of an SMTI-C instance, it can be easily verified in 
polynomial time whether M is complete and stable or not. Hence COM SMTI- 
C problem is in NP. We give a polynomial reduction from the EXACT-MM 
problem which remains NP-complete for bipartite graphs. 

Let $I_G = (G, k)$, where $G = (X \cup Y, E)$ is a bipartite graph with
$X = \{x_1, x_2, x_3, \dots, x_p\}$ and $Y = \{y_1, y_2, y_3, \dots, y_q\}$, be an EXACT-MM instance. If
$k > \min\{p, q\}$, then the EXACT-MM instance would not admit any maximal
matching of size exactly k. Therefore, we assume that $k \le \min\{p, q\}$.

We construct an instance $I_C$ of the COM SMTI-C problem by using the following
steps.


1. Let $X \cup Z \cup \{m\}$ and $Y \cup W \cup \{w\}$ be the set of men and women, respectively,
where $Z = \{z_1, z_2, z_3, \dots, z_{q-k}\}$ and $W = \{w_1, w_2, w_3, \dots, w_{p-k}\}$.

2. Let $Y_i$ (resp. $X_j$) be the set of vertices in Y (resp. X) which are adjacent to
$x_i \in X$ (resp. $y_j \in Y$). Create a preference list for each person as follows:


Men:    $m : w$
        $z_i : (Y)$                                 $(1 \le i \le q-k)$
        $x_i : (Y_i) \; (W) \; w \; (Y \setminus Y_i)$      $(1 \le i \le p)$
Women:  $w : (X) \; m$
        $y_j : (X_j) \; (Z) \; (X \setminus X_j)$           $(1 \le j \le q)$
        $w_j : (X)$                                 $(1 \le j \le p-k)$


In a preference list, the symbol (T) denotes a tie consisting of all members
of T. Clearly, this construction can be completed in polynomial time. Further,
since $\langle m, x_1, x_2, \dots, x_p, z_1, z_2, \dots, z_{q-k} \rangle$ is a consecutive ordering of men
and $\langle w, y_1, y_2, \dots, y_q, w_1, w_2, \dots, w_{p-k} \rangle$ is a consecutive ordering of women
in $I_C$, $I_C$ is an instance of the COM SMTI-C problem. An illustration of
the construction of instance $I_C$ from a bipartite graph G is shown in Fig. 1.
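
The two construction steps above translate directly into code. The following is a sketch under assumed conventions that are not part of the paper: vertices are strings, `edges` is a set of (x, y) pairs, preference lists are lists of ties, the names "m", "w", "z1", ..., "w1", ... are reserved for the added people, and empty ties are dropped. The function name is hypothetical.

```python
def build_com_smti_c(X, Y, edges, k):
    """Build the men's and women's preference lists (lists of ties) of the
    COM SMTI-C instance from the bipartite graph (X, Y, edges) and integer k."""
    p, q = len(X), len(Y)
    if k > min(p, q):
        raise ValueError("k must be at most min(p, q)")
    Z = [f"z{i}" for i in range(1, q - k + 1)]  # q - k extra men
    W = [f"w{i}" for i in range(1, p - k + 1)]  # p - k extra women
    adj_x = {x: [y for y in Y if (x, y) in edges] for x in X}
    adj_y = {y: [x for x in X if (x, y) in edges] for y in Y}

    def ties(*groups):
        # Keep only non-empty ties, copying each group.
        return [list(g) for g in groups if g]

    men = {"m": ties(["w"])}
    for z in Z:
        men[z] = ties(Y)
    for x in X:
        rest = [y for y in Y if y not in adj_x[x]]
        men[x] = ties(adj_x[x], W, ["w"], rest)

    women = {"w": ties(X, ["m"])}
    for y in Y:
        rest = [x for x in X if x not in adj_y[y]]
        women[y] = ties(adj_y[y], Z, rest)
    for w in W:
        women[w] = ties(X)
    return men, women
```

Both sides end up with $p + q - k + 1$ people, which is what makes a complete matching of the constructed instance possible in the first place.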


Claim. G has a maximal matching of size exactly k iff $I_C$ has a complete stable
matching.


Proof. Let M be a maximal matching of size exactly k in G. Define $M_c = M
\cup \{(m, w)\} \cup \{(x_{a_i}, w_i) \mid 1 \le i \le p-k\} \cup \{(z_j, y_{b_j}) \mid 1 \le j \le q-k\}$, where the
$x_{a_i}$'s (resp. $y_{b_j}$'s) are the men (resp. women) who are unmatched w.r.t. M, with
$a_1 < a_2 < \dots < a_{p-k}$ (resp. $b_1 < b_2 < \dots < b_{q-k}$). We show that $M_c$ is a complete
stable matching in $I_C$. Since all men and women are matched in $M_c$, $M_c$ is
complete. Further, since m, $z_j$ $(1 \le j \le q-k)$, and every $x_i$ matched in M are matched to
their first-choice woman, none of them can form a blocking pair in $M_c$. So, let
$(x_u, y_v)$ be a blocking pair of $M_c$ for some $x_u$ which is unmatched with respect to
M and $y_v \in Y_u$. But such a $y_v \in Y_u$ is already matched with her first preference,
as M is maximal. Therefore, such a blocking pair is not possible. Hence no man
can participate in any blocking pair of $M_c$. Therefore, $M_c$ is a stable matching.

Fig. 1. An illustration of the construction of instance $I_C$ from a bipartite graph G (k = 2).

Conversely, suppose $M_c$ is a complete stable matching in $I_C$. Note that m must
be matched with w in $M_c$, as $M_c$ is complete. Also, note that no $x_i$ $(1 \le i \le p)$
can match with any woman in $Y \setminus Y_i$, because if so, then $(x_i, w)$ blocks
$M_c$. Define $M = M_c \setminus M_1$, where $M_1 = \{(z_j, y_{q_j}) \mid 1 \le j \le q-k\} \cup \{(x_{p_i}, w_i)
\mid 1 \le i \le p-k\} \cup \{(m, w)\}$, and $y_{q_j}$ (resp. $x_{p_i}$) is the woman (resp. man) who
is matched with man $z_j$ (resp. woman $w_i$) in $M_c$. Note that $|M| = |M_c| - |M_1| =
(p + q - k + 1) - ((q - k) + (p - k) + 1) = k$. Further, we show that the matching M is
maximal in G. Assume M is not maximal in G. Then $M \cup \{(x_r, y_s)\}$ is a matching
in G for some $(x_r, y_s) \in E$ with $x_r \in X$ and $y_s \in Y$. Hence $(z_\beta, y_s) \in M_c$ for
some $z_\beta \in Z$ and $(x_r, w_\alpha) \in M_c$ for some $w_\alpha \in W$, because $M_c$ is complete.
But this implies $(x_r, y_s)$ is a blocking pair of $M_c$, a contradiction. Hence M is
maximal in G.


Hence, the theorem is proved. 


3.2 COM SMTI-STEP Problem 


We present a polynomial time algorithm to find a complete stable matching in 
an SMTI-STEP instance, if it exists. 


Theorem 2. Algorithm 1 gives a complete stable matching in an SMTI-STEP
instance or reports that no such matching exists in $O(n^2)$ time.


Algorithm 1. COM SMTI-STEP

Input: An SMTI-STEP instance I containing n men $x_1, x_2, \dots, x_n$ and n women
$y_1, y_2, \dots, y_n$.
Output: A complete stable matching in I or report that none exists.
begin
1: $M = \emptyset$;
2: for i = 1 to n do
      $M = M \cup \{(x_i, y_i)\}$;
3: end for
4: Check whether M is stable;
      If yes, output M.
      Else, output no complete stable matching.
end


Proof. First note that the orderings $\langle x_1, \dots, x_n \rangle$ and $\langle y_1, \dots, y_n \rangle$ can
be found by arranging the men (resp. women) in decreasing (resp. increasing)
order of the size of their preference lists. This can be done by using the bucket sort
algorithm in at most $O(n^2)$ time. Further, let the instance I have a complete stable
matching $M_c$. Note that $M_c$ must be unique and $M_c = \{(x_i, y_i) \mid 1 \le i \le n\}$,
because if $x_i$ is matched with some woman other than $y_i$ in $M_c$, then there
exists at least one pair of man and woman who are unmatched in $M_c$ and are
not admissible to each other. Hence $M_c$ will not be complete.


Claim. Algorithm 1 outputs $M_c$.


Proof. After completing step 2, the algorithm has constructed a matching M which is
the same as $M_c$, and since $M_c$ is stable, so is M. Therefore, following step 3, the
algorithm outputs M, which is the same as $M_c$.


Now, suppose I has no complete stable matching. This implies that
$\{(x_i, y_i) \mid 1 \le i \le n\}$ must not be stable, as this is the only complete matching in I. Therefore,
step 4 of the algorithm reports that no complete weakly stable matching exists.

Note that the algorithm clearly terminates. Also, steps 1 and 2 take O(1)
and O(n) time, respectively. Further, we can check whether a matching is
stable or not in $O(n^2)$ time. Therefore, the overall time complexity of the algorithm
is $O(n^2)$.
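
Algorithm 1 can be sketched as follows. This is an illustration under assumptions not fixed by the paper: the orderings of men and women are already given, preference lists are lists of ties, and the stability check is written for clarity rather than tuned to the $O(n^2)$ bound. The function names are hypothetical.

```python
def rank(pref, person):
    """Index of the tie containing `person` in a list of ties, or None if absent."""
    for idx, tie in enumerate(pref):
        if person in tie:
            return idx
    return None


def com_smti_step(men_order, women_order, men_prefs, women_prefs):
    """Match the i-th man with the i-th woman and return the matching if it is
    weakly stable; return None if no complete stable matching exists."""
    matching = dict(zip(men_order, women_order))
    partner = {w: m for m, w in matching.items()}
    for m, w in matching.items():
        # Every matched pair must be admissible.
        if rank(men_prefs[m], w) is None or rank(women_prefs[w], m) is None:
            return None
    for m in men_order:
        own = rank(men_prefs[m], matching[m])
        for tie in men_prefs[m]:
            for w in tie:
                if w == matching[m]:
                    continue
                her_rank = rank(women_prefs[w], m)
                if her_rank is None:  # pair is not admissible
                    continue
                # Blocking pair: both strictly prefer each other to their partners.
                if (rank(men_prefs[m], w) < own
                        and her_rank < rank(women_prefs[w], partner[w])):
                    return None
    return matching
```

Because the candidate matching is forced, the only real work is the stability check, which is where the quadratic term in the running time comes from.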


4 Maximum Stable Matching in SMTI-INC Problem 


Theorem 3. The DECIDE_MAX SMTI-INC problem is NP-complete. 


Proof. Given a matching S of an SMTI-INC instance and a positive integer $k_1$, it
can be easily verified in polynomial time whether S is stable and $|S| \ge k_1$. Hence
DECIDE_MAX SMTI-INC is in NP. To show NP-hardness, we give a polynomial
reduction from the MIN MM-D problem, which is known to be NP-complete for
subdivision graphs of cubic graphs [3]. Note that the subdivision graph of a graph
H is a graph G obtained by replacing each edge of H with a path of length 2. Let
$I_{min} = (G, k)$, where G is a subdivision graph (and hence a bipartite graph)
of some cubic graph H, be a MIN MM-D instance. Let $G = (X \cup Y, E)$, where
$X = \{x_1, x_2, x_3, \dots, x_p\}$ and $Y = \{y_1, y_2, y_3, \dots, y_q\}$. Without loss of generality,
suppose $k \le \min\{p, q\}$.

We construct an instance $I_{max}$ of DECIDE_MAX SMTI-INC by using the
following steps.


1. Let $X' = \{x_{p+1}, x_{p+2}, \dots, x_{p+q}\}$ and $Y' = \{y_{q+1}, y_{q+2}, \dots, y_{q+p}\}$. Let the
sets of men and women in $I_{max}$ be $X \cup X'$ and $Y \cup Y'$, respectively.

2. In G, let $Y_i$ be the set of vertices in Y that are adjacent to $x_i \in X$ and let
$X_j$ be the set of vertices in X that are adjacent to $y_j \in Y$. Create a preference
list for each person as follows:


Men:    $x_i : (Y_i) \; y_{q+i} \; y_{q+i-1} \; \dots \; y_{q+1} \; (Y \setminus Y_i)$      $(1 \le i \le p)$
        $x_i : y_{i-p} \; (Y \setminus \{y_{i-p}\})$                              $(p+1 \le i \le p+q)$

Women:  $y_j : (X_j) \; x_{p+j} \; (X' \setminus \{x_{p+j}\}) \; (X \setminus X_j)$        $(1 \le j \le q)$
        $y_j : x_{j-q} \; x_{j-q+1} \; x_{j-q+2} \; \dots \; x_p$                  $(q+1 \le j \le q+p)$


Clearly, this construction can be completed in polynomial time. Since
$N(x_{p+1}) \subseteq N(x_{p+2}) \subseteq \dots \subseteq N(x_{p+q}) \subseteq N(x_1) \subseteq N(x_2) \subseteq \dots \subseteq N(x_p)$ and
$N(y_1) \supseteq N(y_2) \supseteq \dots \supseteq N(y_q) \supseteq N(y_{q+1}) \supseteq N(y_{q+2}) \supseteq \dots \supseteq N(y_{q+p})$,
this is an instance of the DECIDE_MAX SMTI-INC problem, with parameter
$k_1 = p + q - k$. An illustration of the construction of instance $I_{max}$ from the
bipartite graph G, which is the subdivision graph of a cubic graph H, is shown
in Fig. 2.


Fig. 2. An illustration of the construction of instance $I_{max}$ from the bipartite graph G
which is the subdivision graph of a cubic graph H.


Claim. G has a maximal matching of size at most k iff $I_{max}$ has a stable matching
of size at least $k_1$.


Proof. The proof is omitted due to space constraint. 


Hence, the theorem is proved.


5 Minimum Stable Matching in SMTI-STEP Instance


Theorem 4. The DECIDE_MIN SMTI-STEP problem is NP-complete. 


Proof. Given a matching S of an SMTI-STEP instance and a positive integer
$k_1$, we can easily verify in polynomial time whether S is stable and $|S| \le k_1$.
Hence the DECIDE_MIN SMTI-STEP problem is in NP. To show hardness, we give
a polynomial reduction from the DECIDE_MAX SMTI-INC problem, which we have
shown to be NP-complete. Let $I_{max} = (M_c \cup W_c, k)$ be a DECIDE_MAX SMTI-INC
instance with parameter k that consists of n men, n women, and their
preference lists. So, let $M_c = \{m_1, m_2, m_3, \dots, m_n\}$, $W_c = \{w_1, w_2, w_3, \dots, w_n\}$,
and let $\langle m_1, m_2, \dots, m_n, w_1, w_2, \dots, w_n \rangle$ be the inclusion ordering. Without loss
of generality, we can assume that $k \le n$; otherwise the DECIDE_MAX SMTI-INC
instance would be a 'no' instance.
We construct an instance $I_{min}$ of DECIDE_MIN SMTI-STEP as follows.


1. Corresponding to each man $m_i \in M_c$ and each woman $w_j \in W_c$ in $I_{max}$,
we create a man $x_{n+i}$ and a woman $y_j$, respectively, in $I_{min}$. So, let $X \cup M_s$
and $W_s \cup Y$ be the sets of men and women, respectively, in $I_{min}$, where
$X = \{x_1, x_2, x_3, \dots, x_n\}$, $M_s = \{x_{n+1}, x_{n+2}, \dots, x_{2n}\}$, $W_s = \{y_1, y_2, y_3, \dots, y_n\}$, and
$Y = \{y_{n+1}, y_{n+2}, \dots, y_{2n}\}$. Note that $M_s$ and $W_s$ are created from $M_c$ and
$W_c$, respectively.

2. Let $M_i$ (resp. $W_j$) be the preference list of man $m_i$ (resp. woman $w_j$) in $I_{max}$.
Let $M'_i$ denote the list obtained by changing each $w_j$ present in $M_i$ to $y_j$.
Also, let $W'_j$ denote the list obtained by changing each $m_i$ present in $W_j$ to
$x_{n+i}$. Create preference lists for each person in $I_{min}$ as follows:


$x_i : y_i \; y_{i-1} \; \dots \; y_1$                                        $(1 \le i \le n)$
$x_{n+i} : M'_i \; y_{n+i} \; y_{n+i-1} \; y_{n+i-2} \; \dots \; y_{n+1} \; (W_s \setminus M'_i)$      $(1 \le i \le n)$

$y_j : W'_j \; x_j \; x_{j+1} \; x_{j+2} \; \dots \; x_n \; (M_s \setminus W'_j)$                 $(1 \le j \le n)$
$y_{n+j} : x_{n+j} \; x_{n+j+1} \; \dots \; x_{2n}$                              $(1 \le j \le n)$


Clearly, this construction can be completed in polynomial time. Also, we can
easily note that the instance created above is a DECIDE_MIN SMTI-STEP
instance, with parameter $k_1 = 2n - k$. An illustration of the construction
of instance $I_{min}$ from the instance $I_{max}$ is shown in Fig. 3.


Claim. $I_{max}$ has a stable matching of size at least k iff $I_{min}$ has a stable matching
of size at most $k_1$.


Proof. The proof is omitted due to space constraint. 


Hence, the theorem is proved.


Fig. 3. An illustration of the construction of instance $I_{min}$ from the instance $I_{max}$ in
the proof of Theorem 4.


6 Approximation Algorithm for MIN SMTI-INC


In this section, we present a 3/2-approximation algorithm for the MIN SMTI- 
INC problem, as the NP-hardness of this problem clearly follows from the MIN 
SMTI-STEP problem. 


Algorithm 2. MIN SMTI-INC APPROX

Input: An SMTI-INC instance I containing n men $x_1, x_2, \dots, x_n$ and n women
$y_1, y_2, \dots, y_n$.
Output: A stable matching of size at most 3/2 times the cardinality of a minimum
size stable matching in I.
begin
1: Find an inclusion ordering $\langle m_1, \dots, m_n, w_1, \dots, w_n \rangle$ in I.
   Let $I'$ be the instance obtained after finding the inclusion ordering of men and
   women in I and changing their preference lists accordingly.
2: If there is a tie $T = (w_{\alpha_1}, w_{\alpha_2}, \dots, w_{\alpha_p})$ in any man $m_i$'s list in $I'$, then
   break the tie in the following way: $m_i$ prefers $w_{\alpha_g}$ to $w_{\alpha_h}$ if $\alpha_g < \alpha_h$.
   If there is a tie $T = (m_{\beta_1}, m_{\beta_2}, \dots, m_{\beta_q})$ in any woman $w_j$'s list in $I'$,
   then break the tie in the following way: $w_j$ prefers $m_{\beta_u}$ to $m_{\beta_v}$ if $\beta_u > \beta_v$.
   Let $I''$ be the modified instance resulting from the above breaking of ties in $I'$.
3: Apply the Gale-Shapley algorithm in $I''$ to find a stable matching, say $M_1$.
4: In $M_1$, change the corresponding m's to x's and w's to y's as in the
   original instance I, which were changed in step 1.
   Name the matching containing x's and y's obtained by this transformation of m's
   to x's and w's to y's as M.
5: return M
end
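
Steps 2 and 3 of Algorithm 2 can be sketched as below. The code is an illustration under assumptions not fixed by the paper: the inclusion ordering is supplied as index maps, preference lists are lists of ties, the proposal-based routine is the standard man-proposing Gale-Shapley procedure adapted to incomplete lists, and all names are hypothetical.

```python
def break_ties(men_prefs, women_prefs, man_index, woman_index):
    """Step 2: inside each tie, men prefer lower-indexed women and
    women prefer higher-indexed men. Returns strict preference lists."""
    strict_men = {
        m: [w for tie in ties for w in sorted(tie, key=woman_index.get)]
        for m, ties in men_prefs.items()
    }
    strict_women = {
        w: [m for tie in ties for m in sorted(tie, key=man_index.get, reverse=True)]
        for w, ties in women_prefs.items()
    }
    return strict_men, strict_women


def gale_shapley_smi(strict_men, strict_women):
    """Step 3: man-proposing Gale-Shapley for strict, possibly incomplete lists."""
    rank_w = {w: {m: r for r, m in enumerate(lst)} for w, lst in strict_women.items()}
    next_choice = {m: 0 for m in strict_men}
    fiance = {}  # woman -> man currently engaged to her
    free = list(strict_men)
    while free:
        m = free.pop()
        lst = strict_men[m]
        while next_choice[m] < len(lst):
            w = lst[next_choice[m]]
            next_choice[m] += 1
            if m not in rank_w[w]:  # pair is not admissible
                continue
            current = fiance.get(w)
            if current is None:
                fiance[w] = m
                break
            if rank_w[w][m] < rank_w[w][current]:
                fiance[w] = m
                free.append(current)  # the displaced man proposes again later
                break
        # A man whose list is exhausted simply stays unmatched.
    return {m: w for w, m in fiance.items()}
```

In a two-by-two instance where m1 accepts only w1, m2 is indifferent between w1 and w2, w1 is indifferent between m1 and m2, and w2 accepts only m2, this tie-breaking steers m2 and w1 towards each other and yields a stable matching of size 1, whereas a different tie-breaking could produce the size-2 matching.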


Theorem 5. Algorithm 2 outputs a stable matching of size at most $\frac{3}{2} \cdot |M_{OPT}|$
in an SMTI-INC instance in $O(n^2)$ time, where $M_{OPT}$ is a minimum size
stable matching.


Proof. Since the Gale-Shapley algorithm always returns a stable matching in an
SMI instance in polynomial time, and a matching that is stable after the ties are
broken is weakly stable in the original instance, the matching M returned by Algorithm 2 is
stable.

Let $M_{OPT}$ be an optimal matching, that is, a minimum size stable matching
of the instance I. Let G = (V, E) be the corresponding bipartite graph of I.
Now each component of $G[M \Delta M_{OPT}]$ is either a path or an even cycle (having
an equal number of edges from M and $M_{OPT}$). We denote any $M_{OPT}$-augmenting
path of length 3 as 3AP. Note that any 3AP contains two edges of M and one
edge of $M_{OPT}$. We prove that $G[M \Delta M_{OPT}]$ has no 3AP.


Claim. $G[M \Delta M_{OPT}]$ has no 3AP.


Proof. Note that a 3AP can be of four types, as shown in Fig. 4, where the black
edges and blue edges are from M and $M_{OPT}$, respectively, as $M_{OPT}$ is a minimum
size stable matching.


Fig. 4. Types of $M_{OPT}$-augmenting paths of length 3 (panels: Type I, Type II, Type III, Type IV).


We will show that none of these four types of 3AP is possible.

First assume that G[M Δ M_OPT] has a Type I 3AP. Then w_j does not prefer
m_i over m_{i+k}, else M_OPT will not be stable. For a similar reason, m_{i+k} does
not prefer w_{j+1} over w_j, else (m_{i+k}, w_{j+1}) will be a blocking pair for M_OPT, a
contradiction. Also, if both w_j and m_{i+k} prefer each other over m_i and w_{j+1},
respectively, then the matching M reported by the algorithm will not be stable.
Hence either (i) w_j prefers m_{i+k} over m_i, and m_{i+k} is tied between w_j and w_{j+1},
or (ii) w_j is tied between m_i and m_{i+k}, and m_{i+k} prefers w_j over w_{j+1}, or (iii)
w_j is tied between m_i and m_{i+k}, and m_{i+k} is tied between w_j and w_{j+1}.

Case I: w_j prefers m_{i+k} over m_i, and m_{i+k} is tied between w_j and w_{j+1}.

After step 2 of the algorithm, m_{i+k} prefers w_j over w_{j+1}, as j < j + 1. This
implies (m_{i+k}, w_j) is a blocking pair of M, a contradiction.

Case II: w_j is tied between m_i and m_{i+k}, and m_{i+k} prefers w_j over w_{j+1}.

After step 2 of the algorithm, w_j prefers m_{i+k} over m_i, as i + k > i. This implies
(m_{i+k}, w_j) is a blocking pair of M, a contradiction.

Case III: w_j is tied between m_i and m_{i+k}, and m_{i+k} is tied between w_j and
w_{j+1}.

After step 2 of the algorithm, w_j prefers m_{i+k} over m_i, and m_{i+k} prefers w_j over
w_{j+1}. This implies (m_{i+k}, w_j) is a blocking pair of M, a contradiction.
Therefore, a Type I 3AP is not possible.

Next assume that a Type II 3AP is possible. Since P(m_i) ⊆ P(m_{i+k}), the edge
(m_{i+k}, w_j) must be in G and hence this edge blocks M_OPT, a contradiction.

Assume that a Type III 3AP is possible. Since P(m_i) ⊆ P(m_{i+k}), the pair
(m_{i+k}, w_{j+1}) blocks M_OPT, a contradiction.

Assume that a Type IV 3AP is possible. Since P(w_{j+1}) ⊆ P(w_j), the edge (m_i, w_j)
must be in G and hence this edge blocks M_OPT, a contradiction.

Hence G[M Δ M_OPT] has no 3AP.
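The Type I argument can be replayed on a toy instance. The sketch below assumes, as the case analysis states, that step 2 of the algorithm breaks ties so that a man prefers the lower-indexed woman and a woman prefers the higher-indexed man; step 2 itself is not reproduced in this part of the text, and the helper names `break_ties` and `blocking_pairs` are illustrative rather than the authors' own.

```python
def break_ties(pref, prefer_high_index):
    """Turn preference lists with ties into strict rankings.

    pref maps each agent to a list of tie groups (sets of partner indices),
    most preferred group first. Inside a group, partners are ordered by index:
    descending if prefer_high_index, else ascending (the assumed step-2 rule).
    """
    strict = {}
    for agent, groups in pref.items():
        order = []
        for group in groups:
            order.extend(sorted(group, reverse=prefer_high_index))
        strict[agent] = {partner: rank for rank, partner in enumerate(order)}
    return strict


def blocking_pairs(men_rank, women_rank, matching):
    """Return every pair (m, w) that blocks `matching` under strict rankings.

    A pair blocks if both find each other acceptable, they are not matched
    together, and each is unmatched or strictly prefers the other to the
    current partner.
    """
    wife = {m: w for m, w in matching}
    husband = {w: m for m, w in matching}
    blocks = []
    for m, ranks in men_rank.items():
        for w in ranks:
            if m not in women_rank.get(w, {}) or (m, w) in matching:
                continue
            m_ok = m not in wife or ranks[w] < ranks[wife[m]]
            w_ok = w not in husband or women_rank[w][m] < women_rank[w][husband[w]]
            if m_ok and w_ok:
                blocks.append((m, w))
    return blocks


# Case III of Type I with i = j = k = 1: men m_1, m_2 and women w_1, w_2.
men = {1: [{1}], 2: [{1, 2}]}        # m_2 is tied between w_1 and w_2
women = {1: [{1, 2}], 2: [{2}]}      # w_1 is tied between m_1 and m_2
men_rank = break_ties(men, prefer_high_index=False)
women_rank = break_ties(women, prefer_high_index=True)
M = {(1, 1), (2, 2)}                 # the two M-edges of the 3AP
print(blocking_pairs(men_rank, women_rank, M))   # [(2, 1)]
```

After tie-breaking, the pair (m_2, w_1), that is (m_{i+k}, w_j), blocks M, which is the contradiction used in Case III.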


Also, any other M_OPT-augmenting path in M Δ M_OPT will have length at least
5 (containing two edges of M_OPT and three edges of M).


Claim. |M| ≤ (3/2) · |M_OPT|.


Proof. Let d_1, d_2, ..., d_r be the components of G[M Δ M_OPT] which are M_OPT-
augmenting paths. Let m_i (1 ≤ i ≤ r) be the number of edges of M_OPT in d_i.
This implies that the number of edges of M in d_i is m_i + 1. Further, let c_1, c_2, ..., c_s
be the other components of M Δ M_OPT, which contribute equally to both M and
M_OPT. Let n_j (1 ≤ j ≤ s) be the number of edges of M_OPT, and hence of M,
in c_j. By the above claim, m_i ≥ 2 for all i = 1 to r, so

\[
\frac{|M|}{|M_{OPT}|}
  = \frac{\sum_{i=1}^{r} (m_i + 1) + \sum_{j=1}^{s} n_j}{\sum_{i=1}^{r} m_i + \sum_{j=1}^{s} n_j}
  = 1 + \frac{r}{\sum_{i=1}^{r} m_i + \sum_{j=1}^{s} n_j}
  \le 1 + \frac{r}{2r}
  = \frac{3}{2}.
\]
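As a quick sanity check of the counting argument, the ratio can be evaluated exactly from the component sizes. This is only an illustrative sketch: the function name `ratio` and the list-based encoding of the m_i and n_j values are assumptions made here, not part of the paper.

```python
from fractions import Fraction


def ratio(aug_path_opt_edges, balanced_opt_edges):
    """|M| / |M_OPT| computed from the component counts used in the proof.

    aug_path_opt_edges[i] = m_i, the M_OPT-edges in augmenting path d_i
                            (so M has m_i + 1 edges there)
    balanced_opt_edges[j] = n_j, the M_OPT-edges (and M-edges) in component c_j
    """
    size_m = sum(m + 1 for m in aug_path_opt_edges) + sum(balanced_opt_edges)
    size_opt = sum(aug_path_opt_edges) + sum(balanced_opt_edges)
    return Fraction(size_m, size_opt)


# Extreme case allowed by the first claim: every augmenting path has m_i = 2.
print(ratio([2, 2, 2], []))    # 3/2
# Longer paths or balanced components only lower the ratio.
print(ratio([2, 3], [4]))      # 11/9
```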


Furthermore, each step of the algorithm takes at most O(n^2) time. Therefore,
the running time of the algorithm is O(n^2). Hence, the theorem is
proved.


7 Conclusion 


We have presented the first study of the SMTI problem that analyses the pattern of
the involved preference lists. We have strengthened the NP-completeness result
for the COM SMTI problem by showing that this problem remains NP-complete
for SMTI-C instances. Also, we have improved the NP-completeness result for the
DECIDE_MAX SMTI problem by establishing that the DECIDE_MAX SMTI-INC
problem is NP-complete. Furthermore, for an SMTI-STEP instance, we have
shown that the problem of finding a minimum size stable matching is NP-hard
whereas the problem of finding a complete stable matching is polynomial time
solvable. Finally, we have proposed the first constant factor approximation
algorithm linked with the MIN SMTI problem by presenting a 3/2-approximation
algorithm for the MIN SMTI-INC problem. It remains open to further improve the
approximability bounds for the MIN SMTI problem.




B. S. Panda and Sachin 


References 


1. Gale, D., Shapley, L.S.: College admissions and the stability of marriage. Am.
   Math. Mon. 69(1), 9-15 (1962)
2. Halldórsson, M.M., et al.: Approximability results for stable marriage problems
   with ties. Theoret. Comput. Sci. 306(1-3), 431-447 (2003)
3. Horton, J.D., Kilakos, K.: Minimum edge dominating sets. SIAM J. Discret. Math.
   6(3), 375-387 (1993)
4. Irving, R.W.: Stable marriage and indifference. Discret. Appl. Math. 48(3),
   261-272 (1994)
5. Irving, R.W., Manlove, D.F., O’Malley, G.: Stable marriage with ties and bounded
   length preference lists. J. Discrete Algorithms 7(2), 213-219 (2009)
6. Iwama, K., Miyazaki, S., Morita, Y., Manlove, D.: Stable marriage with incomplete
   lists and ties. In: Wiedermann, J., van Emde Boas, P., Nielsen, M. (eds.) ICALP
   1999. LNCS, vol. 1644, pp. 443-452. Springer, Heidelberg (1999).
   https://doi.org/10.1007/3-540-48523-6_41
7. Király, Z.: Linear time local approximation algorithm for maximum stable
   marriage. Algorithms 6(3), 471-484 (2013)
8. Manlove, D.F., Irving, R.W., Iwama, K., Miyazaki, S., Morita, Y.: Hard variants
   of stable marriage. Theoret. Comput. Sci. 276(1-2), 261-279 (2002)
9. Paluch, K.: Faster and simpler approximation of stable matchings. Algorithms
   7(2), 189-202 (2014)
10. Yannakakis, M., Gavril, F.: Edge dominating sets in graphs. SIAM J. Appl. Math.
    38(3), 364-372 (1980)




On Fair Division with Binary Valuations 
Respecting Social Networks 


Neeldhara Misra and Debanuj Nayak


Indian Institute of Technology, Gandhinagar, India 
neeldhara.m@iitgn.ac.in, debanuj.nayak@alumni.iitgn.ac.in 


Abstract. We study the computational complexity of finding fair allo- 
cations of indivisible goods in the setting where a social network on the 
agents is given. Notions of fairness in this context are “localized”, that 
is, agents are only concerned about the bundles allocated to their neigh- 
bors, rather than every other agent in the system. We comprehensively 
address the computational complexity of finding locally envy-free and 
Pareto efficient allocations in the setting where the agents have binary 
valuations for the goods and the underlying social network is modeled by 
an undirected graph. We study the problem in the framework of param- 
eterized complexity. 

We show that the problem is computationally intractable even in fairly 
restricted scenarios, for instance, even when the underlying graph is a 
path. We show NP-hardness for settings where the graph has only two 
distinct valuations among the agents. We demonstrate W-hardness with 
respect to the number of goods or the size of the vertex cover of the under- 
lying graph. We also consider notions of proportionality that respect the 
structure of the underlying graph. 


Keywords: Fair division · Social networks · Envy-freeness ·
Parameterized complexity


1 Introduction 


The problem of fairly allocating resources among a set of agents with (possibly 
distinct) interests in said resources is a fundamental problem with important 
and varied practical applications. We focus on the problem of allocating indivis- 
ible items: in this setting, we have n agents and m resources, and every agent 
expresses their utilities for the resources, either as a ranking over the resources 
or by specifying a valuation function. The goal is to determine an allocation of 
the items to the agents that respects some notion of “fairness” and “efficiency”. 
We use the term bundle to refer to the set of items that an agent receives in an 
allocation. 

Envy-freeness is one of the most widely used notions of fairness. Given an 
allocation, an agent envies another if it perceives the bundle of the other agent 
to be more valuable than her own. An allocation is envy-free if no agent envies
another. Note that the trivial allocation that leaves every agent empty-handed
is always envy-free. Therefore, one is typically interested in fair allocations that 
also satisfy some criteria of economic efficiency, such as completeness (every good 
should be allocated to some agent), non-wastefulness (no agent receives a piece 
of cake that is worth nothing to her and worth something to another agent), 
or Pareto-efficiency (there is no other feasible agreement that would make at 
least one agent strictly better off while not making any of the others worse 
off). We remark here that just as there are trivial allocations that are fair, it is 
also possible to trivially achieve efficiency if we had no fairness considerations 
involved: for instance, the allocation that gives all goods to a single agent is 
Pareto-efficient assuming that the agent has a strictly monotonic utility function 
over the items. 

The question of finding allocations that respect fairness and efficiency 
demands simultaneously is non-trivial: in particular, such allocations may not 
exist (if there are two agents and one good, and both agents have positive utility 
for this single resource), and can be computationally hard to find (for instance, 
the problem of finding a complete envy-free allocation between even two agents 
who hold identical valuations over m goods is equivalent to the PARTITION prob- 
lem). 
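The link to PARTITION can be made concrete. With two agents holding identical additive valuations, a complete allocation is envy-free exactly when both bundles have the same value, so the question becomes whether the item values split into two halves of equal sum. The sketch below assumes non-negative integer values and uses a standard subset-sum table, which is pseudo-polynomial and does not contradict the hardness remark; the function name is illustrative.

```python
def complete_envy_free_split(values):
    """Return item indices for agent 1 worth exactly half of the total value,
    or None if no complete envy-free allocation exists.

    Assumes two agents with identical additive, non-negative integer valuations.
    """
    total = sum(values)
    if total % 2:
        return None
    target = total // 2
    reachable = {0: frozenset()}          # achievable sum -> one witness bundle
    for idx, val in enumerate(values):
        # Iterate over a snapshot so each item is used at most once.
        for s, bundle in list(reachable.items()):
            if s + val <= target and s + val not in reachable:
                reachable[s + val] = bundle | {idx}
    return set(reachable[target]) if target in reachable else None


values = [3, 1, 1, 2, 2, 1]
bundle = complete_envy_free_split(values)
print(bundle, sum(values[i] for i in bundle))
print(complete_envy_free_split([2, 2, 3, 5]))   # None: no subset sums to 6
```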

The focus of this work is the notion of local envy-freeness. In this setting, the 
agents are related by a graph, which might be thought of as modeling a social 
network over the agents, and we explore notions of fairness that account for the 
structure of this network. For instance, the notion of envy is now restricted: it 
only manifests between agents who are friends in the network. This is a com- 
pelling model of fairness, since agents are likely to not envy agents about whom 
they have little or no information. We note that the problem of fair division 
respecting a social network generalizes the classical notion, which can be cap- 
tured by considering a complete graph on the agents. Thus, the problem of find- 
ing allocations that are “locally fair” is a generalization of the classical allocation 
problem. 


1.1 Related Work 


The model of local envy-freeness has been proposed and considered in several 
recent lines of work. Some of the earliest considerations for incorporating a graph 
structure on the agents were made in the context of the cake-cutting prob- 
lem, which is the closely related setting of allocating a divisible resource among 
agents [1,4]. Abebe et al. [1] consider both directed and undirected graphs and 
focus on characterizing the structure of graphs that admit algorithms with cer- 
tain bounds. They also consider the issue of the price of envy-freeness in this 
setting, which compares the total utility of an optimal allocation to the best 
utility of an allocation that is envy-free. Bei et al. [4], on the other hand, pro- 
pose a moving-knife algorithm that outputs an envy-free allocation on trees and 
an algorithm for computing a proportional allocation on descendant graphs. 
We now turn to the literature in the context of indivisible items. Beynier 
et al. [5] study the fair division problem in the setting of “house allocation”:
here agents have (strict) preferences over items, and each agent must receive
exactly one item. An agent envies another in this setting if she prefers the item 
received by the other agent over her own. In the case of a complete network, 
for an allocation to be envy-free, each agent must get her top object, and this 
assignment is automatically Pareto-efficient as well. This motivates the setting of 
local envy-freeness with respect to a graph on the agents. The authors consider 
the case when the underlying graph is undirected, and they also consider a 
variant of the problem where agents themselves can be located on the network by 
the central authority. These problems turn out to be computationally intractable 
even on very simple graph structures. 

Bredereck et al. [10] consider the problem of graph-based envy-freeness in the 
context of directed graphs and for various classes of valuations: including binary, 
identical, additive, and even valuations that are both identical and binary. They 
also consider the complexity of the allocation problem in the framework of parameterized
complexity.¹ Somewhat surprisingly, it turns out that finding complete
envy-free allocations in the setting of a graph is NP-hard even when the valuations 
are binary and identical. Note that in this setting, every agent in every strongly 
connected component must get the same number of items: thus, the allocation 
problem is trivial for directed graphs that are strongly connected, but NP-hard for 
general directed graphs. Also, it turns out that for general binary preferences, the 
problem of finding a complete envy-free allocation is NP-hard even when the graph 
is strongly connected. The problem is also tractable for DAGs: indeed, allocating 
all resources to a single source agent (corresponding to a vertex with no incoming 
arcs) is both complete and locally envy-free since nobody can envy a source agent, 
and empty-handed agents have no envy for each other. 
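The DAG observation translates directly into a few lines of code. The sketch below assumes that an arc (u, v) means agent u can see, and hence possibly envy, agent v, so that a source is an agent nobody looks at; the function names and the `value` callback are illustrative choices.

```python
def allocate_all_to_source(agents, arcs, goods):
    """Give every good to one source agent (no incoming arcs).

    Returns a dict agent -> bundle, or None if the digraph has no source,
    in which case it contains a directed cycle and is not a DAG.
    """
    has_incoming = {v for (_, v) in arcs}
    sources = [a for a in agents if a not in has_incoming]
    if not sources:
        return None
    allocation = {a: set() for a in agents}
    allocation[sources[0]] = set(goods)
    return allocation


def is_locally_envy_free(arcs, allocation, value):
    """Check that no agent u values the bundle of an out-neighbor v above its own.

    value(agent, bundle) returns the agent's utility for the bundle.
    """
    return all(value(u, allocation[u]) >= value(u, allocation[v]) for (u, v) in arcs)


agents = ["a", "b", "c"]
arcs = {("a", "b"), ("b", "c"), ("a", "c")}      # "a" is the only source
alloc = allocate_all_to_source(agents, arcs, ["g1", "g2"])
print(is_locally_envy_free(arcs, alloc, lambda agent, bundle: len(bundle)))   # True
```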

More recently, Eiben et al. [13] consider the problem of finding locally envy- 
free allocations and envy-free allocations that are additionally proportional in 
the setting of directed graphs in the framework of parameterized complexity, and 
specifically considering parameters such as treewidth, cliquewidth, and vertex 
cover — all of these reflect the structure of the underlying network. It turns out 
that the problem of finding fair and efficient allocations is tractable for networks 
that have bounded values for these parameters with some additional assumptions 
that bound the number of item types or the size of the largest bundle received by 
an agent. The authors also show hardness results in both the parameterized and 
classical settings. For instance, the authors show that finding a locally envy-free 
allocation is NP-hard even when the underlying network is a star, but we note 
that this is in the setting of general utilities. 

The work of Bredereck et al. [8,9] demonstrates that the problem of find- 
ing fair and efficient allocations in various settings (including graph-based con- 
straints) is fixed-parameter tractable in the combined parameter “number of 
agents” and “number of item types” for general utilities. In contrast, our work 
here focuses on smaller parameters for the special case of binary utilities. 

In [11], Chevaleyre, Endriss, and Maudet consider distributed mechanisms 
for allocating indivisible goods, in which agents can locally agree on deals to 


1 The terminology relevant to this framework is introduced in the next section.

exchange some of the goods in their possession. This study focuses on convergence
properties for such distribution mechanisms both in the context of the
classical setting and the setting involving social constraints coming from an
underlying undirected graph. Here, the notions of fairness are localized according to
the graph, and the network also constrains the exchanges that can take place:
agents can engage in an exchange only if they are friends in the network. There
are also some lines of work that suggest eliminating envy by some mechanism 
for hiding information [14]. 


1.2 Our Contributions 


Our focus in this paper is on the setting when agents have binary valuations 
over the goods and the underlying social network is modeled by an undirected 
graph. Our focus is on exploring the computational complexity of finding locally 
envy-free allocations that are also Pareto efficient (EEF) in the framework of 
parameterized complexity, building most closely on the works of [6, 10, 13].


Bounded Agent Types. We begin by noting that the setting of undirected graphs 
can be significantly different from their directed counterparts: indeed, recall that 
finding a complete and locally envy-free allocation was NP-hard for even identical 
binary valuations for directed graphs, but the analogous question is easily seen 
to be tractable for undirected graphs (indeed, observe that the notions of strong 
connectivity and connectivity coincide). This motivates the question of whether 
the problem of finding locally EEF allocations is easier for undirected graphs 
with a bounded number of agent types. We answer this question in the negative 
by showing that the problem of determining locally envy-free allocations is NP- 
hard even when there are only two distinct binary valuations among the agents 
by a reduction from a graph separation problem called CUTTING ℓ VERTICES
(Theorem 1). 


Sparse and Dense Graphs. In contrast with the result for DAGs, we show that 
finding locally envy-free allocations that are Pareto efficient (EEF) is NP-hard 
even when the underlying graph is a path (Theorem 4 and Corollary 1). Although 
Beynier et al. [5] also show hardness results for very sparse graphs, we note that 
our methods are significantly different since the models for the valuations are 
different and additionally, the allocations we seek need not give every agent 
exactly one item. Moving away from sparsity, we recall that finding complete 
envy-free allocations for binary valuations is known to be NP-hard even for 
complete graphs [3,14], which justifies the need for using additional parameters²
in the XP³ algorithm for finding locally envy-free allocations shown by [13,
Theorem 10]. 


2 The algorithm referred to is XP in the cliquewidth of the underlying graph, the
number of agent types and item types.

3 XP is the class of parameterized problems that can be solved in time n^{f(k)} for some
computable function f, where k is the parameter.

Structural Parameters I: Treewidth and Cliquewidth. Informally speaking, the
parameters treewidth and cliquewidth of graphs quantitatively capture the spar- 
sity and density of the graph by measuring their “likeness” to trees and complete 
graphs. The results we have already for sparse and dense graphs demonstrate 
that these parameters being bounded alone is not enough to obtain tractable 
algorithms. On the other hand, the results of [13] imply that the problem of 
finding complete and locally envy-free allocations admits XP algorithms when 
parameterized by either the treewidth or cliquewidth of the underlying graph 
jointly with the number of item types and agent types. Since their model allows 
for bidirectional edges, these results apply to the setting of undirected graphs as 
well. We note that the algorithms described in [13] focus on complete allocations, 
but can be adapted to account for Pareto efficiency as well. 


Structural Parameters II: Vertex Cover and Twin Cover. In the setting of 
directed graphs and general utilities, we note that the problem of finding a com- 
plete and locally envy-free allocation is NP-hard even when the underlying graph 
is a star. In particular, this demonstrates hardness on graphs with a constant-sized
vertex cover.⁴ It is not clear if this is the case for undirected graphs and
binary utilities. We show that the problem of finding locally EEF allocations
is W[1]-hard when parameterized by the vertex cover number (Theorem 3). We
remark that a stronger hardness result can be observed for the closely related
parameter of twin cover⁵: indeed, the known NP-hardness of finding envy-free
allocations for binary valuations on complete graphs [3,14] implies hardness for
graphs that have a twin cover of size zero.


Few Resources or Agents. We also consider the cases where the number of goods 
or the number of agents are relatively small. When considering these parameters, 
the work of Bliem et al. [6] shows that the computation of EEF allocations is FPT 
when parameterized by the number of goods or the number of agents for additive 
0/1 valuations. In contrast, we show that finding EEF allocations respecting the 
structure of an underlying undirected graph is W[1]-hard when parameterized
by the number of goods (Theorem 2). On the other hand, the FPT algorithm
when parameterized by the number of agents can be extended to account for the
graph constraints (noted in Observation 1 in the full version [15]).


Other Notions of Fairness. Finally, we also consider notions of proportionality 
in the context of graphs — we refer to these as local and quasi-global propor- 
tionality concepts, representing the extent to which the definitions account for 
the underlying graph. We demonstrate that computing a locally proportional 
allocation is NP-hard (Theorem 5), while computing a proportional allocation 
that is quasi-global is tractable (Theorem 6). Notions of local proportionality 


4 A vertex cover of a graph is a subset of vertices that contains at least one endpoint 
of every edge. A graph with a bounded vertex cover also has bounded treewidth. 

5 A twin cover of a graph is a subset of vertices S such that G\S is a disjoint union
of cliques, and further, every pair of vertices u, v in any clique of G\S are “twins”,
that is, N[v] = N[u].

have been proposed and studied in several of the papers that were summarized
in the previous section. 


2 Preliminaries 


We use standard terminology from graph theory and fair division. Unless mentioned
otherwise, the graphs we consider are simple and undirected. For a graph
G = (V, E), consisting of a set V of vertices and a set E of edges, by N(v) we
denote the neighborhood of vertex v ∈ V, i.e., the set W ⊆ V of vertices such
that for each vertex w ∈ W there exists an edge e = (v, w) ∈ E. The closed
neighborhood of a vertex v is N(v) ∪ {v} and is denoted N[v]. The degree of a
vertex v, denoted d(v), is |N(v)|. A clique is a subset of vertices which are pairwise
adjacent. An independent set is a subset of vertices, no two of which are
adjacent. For X ⊆ V, the induced subgraph G[X] denotes the subgraph whose
vertex set is X and whose edge set consists of all edges with both endpoints
in X.

An instance of fair division for indivisible goods consists of n agents A =
{1, ..., n} and m goods (also called items or resources), R = {o_1, ..., o_m}. Further,
we are also given valuations (also called preference functions or utilities)
v_ℓ : 2^R → ℤ for every agent ℓ ∈ A. We will assume throughout that the
valuation functions are additive, i.e., for each agent ℓ ∈ A and any set of goods
S ⊆ R, v_ℓ(S) = Σ_{o ∈ S} v_ℓ({o}). A 0/1 valuation is a function that assigns each
good a value in {0, 1}, while valuations are said to be identical if every agent has the same
preference function. In the context of 0/1 valuations, we say that an agent values
or approves a good if her utility for the good is 1. We will use V to denote the
valuations of the agents A over R. When considering fair division in the context
of social networks, we are also given an undirected graph G over the agents A.

Every subset S ⊆ R is called a bundle. An allocation is a function π : A → 2^R
mapping each agent to the bundle she receives, such that π(i) ∩ π(j) = ∅ when
i ≠ j because the items cannot be shared. When ∪_{a ∈ A} π(a) = R, the allocation
π is said to be complete, otherwise it is partial. An allocation is non-wasteful if
every good is allocated to an agent that assigns positive utility to it.

An allocation π' dominates π if for all ℓ ∈ A it holds that v_ℓ(π(ℓ)) ≤
v_ℓ(π'(ℓ)) and for some a_j ∈ A it holds that v_{a_j}(π(a_j)) < v_{a_j}(π'(a_j)). An
allocation π is Pareto-efficient if there exists no allocation π' that dominates π.
In the case of 0/1 preferences, we note that an allocation is Pareto-efficient if and
only if it is complete and non-wasteful, assuming that each resource provides a
value of 1 to at least one agent.
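For 0/1 valuations this characterization gives a direct test for Pareto-efficiency. The following is a minimal sketch under the stated assumption that every good is approved by at least one agent; the dictionary encoding (`approves` maps an agent to the set of goods she values, `allocation` maps an agent to her bundle) is chosen here for illustration.

```python
def is_complete(goods, allocation):
    """Every good appears in some agent's bundle."""
    return set().union(*allocation.values()) == set(goods)


def is_non_wasteful(approves, allocation):
    """Every allocated good goes to an agent who approves it."""
    return all(g in approves[a] for a, bundle in allocation.items() for g in bundle)


def is_pareto_efficient_01(goods, approves, allocation):
    """Pareto-efficiency test for 0/1 valuations: complete and non-wasteful.

    Only valid when each good is approved by at least one agent.
    """
    if not all(any(g in approved for approved in approves.values()) for g in goods):
        raise ValueError("some good is approved by nobody")
    return is_complete(goods, allocation) and is_non_wasteful(approves, allocation)


goods = ["o1", "o2"]
approves = {1: {"o1"}, 2: {"o1", "o2"}}
print(is_pareto_efficient_01(goods, approves, {1: {"o1"}, 2: {"o2"}}))   # True
print(is_pareto_efficient_01(goods, approves, {1: {"o2"}, 2: {"o1"}}))   # False (wasteful)
print(is_pareto_efficient_01(goods, approves, {1: {"o1"}, 2: set()}))    # False (partial)
```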

Given an instance of fair division (A,R,G = (A,E),V) as described above, 
we now introduce the following fairness notions: 


> Graph Envy-Freeness (GEF). We call an allocation π graph-envy-free if for
each pair of (distinct) agents i, j ∈ A such that j ∈ N(i), it holds that
v_i(π(i)) ≥ v_i(π(j)).

> Quasi-Global Proportionality (QP). We say that an allocation π achieves
quasi-global proportionality if for each agent i ∈ A,
v_i(π(i)) ≥ (1/|N[i]|) · v_i(R).

> Local Proportionality (LP). We say that an allocation π achieves local
proportionality if for each agent i ∈ A,
v_i(π(i)) ≥ (1/|N[i]|) · Σ_{j ∈ N[i]} v_i(π(j)).
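The three notions can be compared on a small example. The checks below are a sketch for 0/1 valuations and assume that both proportionality notions are normalized by the size of the closed neighborhood, |N[i]|, which is consistent with the remark that graph envy-freeness implies local proportionality; the data layout (`adj`, `approves`, `alloc`) is an illustrative choice.

```python
def value(approves, agent, bundle):
    """0/1 additive utility: number of goods in the bundle the agent approves."""
    return len(approves[agent] & bundle)


def is_gef(adj, approves, alloc):
    """No agent values a neighbor's bundle above her own."""
    return all(value(approves, i, alloc[i]) >= value(approves, i, alloc[j])
               for i in adj for j in adj[i])


def is_lp(adj, approves, alloc):
    """Own value is at least the average over the closed neighborhood N[i]."""
    for i in adj:
        closed = adj[i] | {i}
        total = sum(value(approves, i, alloc[j]) for j in closed)
        if value(approves, i, alloc[i]) * len(closed) < total:
            return False
    return True


def is_qp(adj, approves, alloc, goods):
    """Own value is at least a 1/|N[i]| share of the value of all goods."""
    return all(value(approves, i, alloc[i]) * (len(adj[i]) + 1)
               >= value(approves, i, set(goods)) for i in adj)


# Path 1 - 2 - 3; agent 3 approves nothing.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
goods = ["a", "b", "c"]
approves = {1: {"a", "b", "c"}, 2: {"a", "b", "c"}, 3: set()}
alloc = {1: {"a", "b"}, 2: {"c"}, 3: set()}
print(is_gef(adj, approves, alloc))         # False: agent 2 envies agent 1
print(is_lp(adj, approves, alloc))          # True
print(is_qp(adj, approves, alloc, goods))   # True
```

The example shows an allocation that is locally proportional without being graph-envy-free, so the implication between the two notions is strict.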


Note that the graph versions of variants of envy-freeness (such as EF1 or
EFX) can be defined analogously in a straightforward manner. It is easy to 
see that any graph envy-free allocation is also locally proportional and that if 
the underlying graph is complete, then local proportionality coincides with the 
standard notion of proportionality. For the problems we consider, we are typically 
given an instance of fair division on a graph, and the goal is to determine if 
there exists an allocation that satisfies some notion of fairness and efficiency. 
For instance, consider the following problems: 


GRAPH ENVY-FREE ALLOCATION (E-GEFA)

Input: An instance of fair division on a graph (A, R, G = (A, E), V).

Question: Does there exist an envy-free, Pareto-efficient allocation?


LOCALLY PROPORTIONAL ALLOCATION (E-LPA)

Input: An instance of fair division on a graph (A, R, G = (A, E), V).

Question: Does there exist a Pareto-efficient allocation that achieves
local proportionality?


For any efficiency concept (X) and fairness notion (Y), the X-YA problem 
is defined in a similar fashion. Although our questions are posed as decision 
versions, we note that most of our algorithms can be easily adapted to handle the 
natural “search” version of these problems. We refer the reader to the books [7, 
16] and the article [11] for additional background on fair division. 

A problem parameterized by k is fixed-parameter tractable if it is solvable in
f(k) · |I|^{O(1)} time for some computable function f, where |I| is the input size according
to the problem’s encoding. Informally, W-hard problems are presumably not
fixed-parameter tractable. The problem of finding a clique on at least k vertices
is W[1]-hard when parameterized by k. We call a problem para-NP-hard if it
is NP-hard even for a constant value of the parameter. For a comprehensive
introduction to the paradigm of parameterized complexity and algorithms, we
refer the reader to the book [12].
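To see what the distinction means for CLIQUE, consider the obvious exhaustive search. It inspects every k-subset of the n vertices, so its running time is n^{O(k)}: polynomial for each fixed k, but with k in the exponent, which is exactly what fixed-parameter tractability forbids. The sketch below is only this baseline; the function name is illustrative.

```python
from itertools import combinations


def find_clique(vertices, edges, k):
    """Return a set of k pairwise adjacent vertices, or None if none exists.

    Tries all k-subsets, hence roughly n^k candidates with O(k^2) work each.
    """
    edge_set = {frozenset(e) for e in edges}
    for cand in combinations(vertices, k):
        if all(frozenset(pair) in edge_set for pair in combinations(cand, 2)):
            return set(cand)
    return None


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]     # a triangle with a pendant vertex
print(find_clique(range(4), edges, 3))       # {0, 1, 2}
print(find_clique(range(4), edges, 4))       # None
```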


3 Envy-Freeness


3.1 NP-Hardness for Two Agent Types 


In this section, we show that finding E-GEFA allocations is NP-hard even in the
setting of near-identical binary valuations: in particular, when all agents have
one of two possible utilities over the items. Note that in the setting of identical
binary valuations when the graph G is connected, it is easy to see that all agents
must value all goods without loss of generality, and that desirable allocations are
the ones that allocate the same number of goods to each agent, where the goods
themselves may be arbitrarily chosen. Indeed, it is clear that an allocation with
equal bundle sizes is E-GEFA. On the other hand, consider an E-GEFA allocation
that does not allocate bundles of equal size to all agents. Let a_i and a_j be two
agents that receive bundles of different sizes. We can always find two adjacent
agents on a path from a_i to a_j who have received bundles of different sizes,
contradicting envy-freeness.
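The identical-valuation case therefore reduces to a divisibility test. The sketch below assumes a connected graph and that every good is valued by every agent (the without-loss-of-generality form used above), so the graph itself plays no role; the function name is illustrative.

```python
def equal_split(agents, goods):
    """Return an allocation with equal bundle sizes, or None if none exists.

    Assumes a connected graph and identical 0/1 valuations in which every
    agent values every good; then an efficient graph-envy-free allocation
    exists exactly when the number of agents divides the number of goods.
    """
    n = len(agents)
    if len(goods) % n:
        return None
    share = len(goods) // n
    return {a: set(goods[idx * share:(idx + 1) * share])
            for idx, a in enumerate(agents)}


print(equal_split(["a1", "a2", "a3"], ["g1", "g2", "g3", "g4", "g5", "g6"]))
print(equal_split(["a1", "a2", "a3"], ["g1", "g2", "g3", "g4"]))   # None
```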

We now show that even a slightly more general situation is computationally 
intractable — in particular, if all agents have one of two valuations over the 
goods, the problem of identifying E-GEFA allocations is NP-hard. Due to lack 
of space, the proof is deferred to the full version [15]. 


Theorem 1. The E-GEFA problem is NP-complete even when there are two
agent types, and further, agents have 0/1 valuations over the goods. 


3.2 W-Hardness Parameterized by Goods 


In this section, we demonstrate the hardness of finding E-GEFA allocations even
when the number of goods is bounded by showing that the problem is W[1]-hard 
when parameterized by the number of goods. 


Theorem 2. The E-GEFA problem is W[1]-hard when parameterized by the 
number of goods, even when agents have 0/1 valuations over the goods. 


We describe a reduction from the W[1]-hard problem CLIQUE: given a graph
G and an integer k, does there exist a clique on k vertices in G? Let J = (G, k)
be an instance of CLIQUE, where G = (V, E) and further, V = {v_1, ..., v_n} and
E = {e_1, ..., e_m}. We assume, without loss of generality, that m ≥ (k choose 2) since we
can always return a trivial NO-instance when this is not the case. We begin by
describing the construction of the reduced instance J_H := (A, R, H = (A, F), V).
We define the set of goods R as follows:

\[ R = \{q_1, \dots, q_\ell,\ p_1, \dots, p_k,\ d_1, \dots, d_{\ell+1}\}, \]

where ℓ = (k choose 2). For ease of discussion, we call the first ℓ goods popular and
the next k goods specialized. The remaining are dummy items. We now define
the set of agents as A = V ∪ E ∪ S ∪ W, where:

\[ S := \{s_1, \dots, s_{\ell+1}\} \quad \text{and} \quad W := \{w_{ij} \mid i \in [n],\ j \in [\ell+1]\}. \]


We indulge in a mild abuse of notation and use v_i to refer to both an element
of V from the clique instance and an agent of A in the reduced instance (similarly 
for edges). The edges of H are as follows: 


> e = (u, v) ∈ E is adjacent to all vertices of S and to u and v.
> v_i ∈ V is adjacent to all vertices w_{ij} for j ∈ [ℓ+1].
> For each 1 ≤ i ≤ n, H[∪_{j=1}^{ℓ+1} {w_{ij}}] is a clique.

The preferences of the agents are as follows:

> All agents have a utility of 1 for the popular goods.
> All agents in V have a utility of 1 for the specialized goods.
> The agent s_i ∈ S has a utility of 1 for d_i, for all i ∈ [ℓ+1].

This completes the construction of the instance (Fig. 1). Note that the number of
goods is a function of k alone. We now turn to the argument for equivalence: a
clique X ⊆ V of size k exists in G if and only if there is a GEF allocation for the
constructed instance J_H := (A, R, H = (A, F), V).

Fig. 1. A sketch of the reduced instance based on an instance (G = (V, E), k) of
CLIQUE. Recall that ℓ denotes (k choose 2) and only some vertices of W are shown
for clarity. The shaded vertices induce a complete subgraph. The edge
e_i = (v_i, v_j) is adjacent only to v_i and v_j among vertices in V.

Proof. In the forward direction, let X ⊆ V be a clique in G and let Y := G[X] ⊆ G.
Consider now the following allocation π. We let each agent corresponding to an
edge of Y receive one popular item, each agent corresponding to a vertex of X
receive one specialized item, and finally allocate the item d_i to s_i for all
i ∈ [ℓ+1]. It is straightforward to verify that the allocation π is
Pareto-efficient and envy-free with respect to H.
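The construction and the forward direction can be checked mechanically on a small graph. The sketch below builds the reduced instance as described (edge agents adjacent to S and to their endpoints, each v_i adjacent to a private clique of ℓ + 1 agents of W) and verifies the allocation obtained from a clique. The tuple labels such as ("v", i) or ("q", i), the function names, and the convention that every edge is given as (u, v) with u < v are assumptions made for this illustration.

```python
from itertools import combinations


def build_reduced_instance(n, edges, k):
    """Build (adjacency of H, goods, approvals) from a CLIQUE instance.

    Vertices are 0..n-1 and every edge is a pair (u, v) with u < v.
    """
    ell = k * (k - 1) // 2
    V = [("v", i) for i in range(n)]
    E = [("e", u, v) for u, v in edges]
    S = [("s", i) for i in range(ell + 1)]
    W = [("w", i, j) for i in range(n) for j in range(ell + 1)]
    adj = {a: set() for a in V + E + S + W}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for e in E:
        for s in S:                        # edge agents see every agent of S
            link(e, s)
        link(e, ("v", e[1]))               # and their two endpoints
        link(e, ("v", e[2]))
    for i in range(n):
        ws = [("w", i, j) for j in range(ell + 1)]
        for w in ws:                       # v_i sees all of its w_{ij}
            link(("v", i), w)
        for a, b in combinations(ws, 2):   # which form a clique
            link(a, b)

    popular = [("q", i) for i in range(ell)]
    special = [("p", i) for i in range(k)]
    dummy = [("d", i) for i in range(ell + 1)]
    approves = {a: set(popular) for a in adj}
    for v in V:
        approves[v].update(special)
    for s, d in zip(S, dummy):
        approves[s].add(d)
    return adj, (popular, special, dummy), approves


def allocation_from_clique(clique, goods):
    """Forward direction: map each good to the agent receiving it."""
    popular, special, dummy = goods
    X = sorted(clique)
    owner = {p: ("v", v) for p, v in zip(special, X)}
    owner.update({q: ("e", u, v) for q, (u, v) in zip(popular, combinations(X, 2))})
    owner.update({d: ("s", i) for i, d in enumerate(dummy)})
    return owner


def is_efficient_and_gef(adj, goods, approves, owner):
    """Complete and non-wasteful (Pareto-efficient for 0/1), and envy-free along H."""
    all_goods = [g for group in goods for g in group]
    if set(owner) != set(all_goods):
        return False
    if any(g not in approves[a] for g, a in owner.items()):
        return False
    bundle = {a: set() for a in adj}
    for g, a in owner.items():
        bundle[a].add(g)
    return all(len(bundle[a] & approves[a]) >= len(bundle[b] & approves[a])
               for a in adj for b in adj[a])


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]    # triangle {0, 1, 2} plus a pendant vertex
adj, goods, approves = build_reduced_instance(4, edges, k=3)
owner = allocation_from_clique({0, 1, 2}, goods)
print(is_efficient_and_gef(adj, goods, approves, owner))   # True
```

Moving a specialized good from a clique vertex to a vertex outside the clique makes the deprived vertex envy an incident edge agent holding a popular good, which mirrors the reverse direction of the argument.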


This concludes our description of a fair and complete allocation strategy
given a clique in G. We now turn to the reverse direction, where we are given an
allocation π that is Pareto-efficient and envy-free with respect to H. It is useful
to make the following observation about π to begin with.


Claim. Let H be defined as above, and let π be an allocation that is Pareto-
efficient and envy-free with respect to H. Then, any popular good is assigned by
π to an agent from E. Further, no agent in E can receive more than one popular
good in the allocation π.


Since π is non-wasteful, the specialized goods must be distributed among
agents corresponding to V. The following is easy to see.


Claim. Let H be defined as above, and let π be an allocation that is Pareto-
efficient and envy-free with respect to H. No agent in V can receive more than
one specialized good in the allocation π.


Let X ⊆ V be the subset of k agents that receive at least one specialized item
and let Y ⊆ E be the subset of ℓ agents that receive at least one popular item
with respect to π. We claim that G[X] is a clique. In particular, we claim that
every edge of Y has both its endpoints in X. Indeed, suppose not, and let e ∈ Y
be an edge with at least one endpoint (say v) outside X. Then v receives no good
that it values (popular goods go to agents in E by the first claim, and v receives
no specialized good), while e holds a popular good. Hence v envies e, which
contradicts our assumption about π being envy-free with respect to H.


3.3 W-Hardness Parameterized by Vertex Cover


Recall that a vertex cover of a graph G = (V, E) is a subset S ⊆ V such that G\S
is an independent set (i.e., for any pair of vertices u, v ∈ G\S, (u, v) ∉ E). In the
setting of directed graphs with arbitrary utilities, finding E-GEFA allocations is
NP-hard even for graphs that have a constant-sized vertex cover. Here, we show
that in the setting of binary utilities, finding an E-GEFA allocation is W[1]-hard
when parameterized by the vertex cover of the underlying graph. Due to lack of
space, the proof is deferred to the full version [15].
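For readers who want to experiment with the parameter, the definition is easy to test directly. The sketch below checks whether a set is a vertex cover and finds a smallest one by exhaustive search, which is exponential and meant for toy graphs only; both function names are illustrative. On a star, the center alone is a cover, which is why stars have a constant-sized vertex cover.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """Every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def min_vertex_cover(vertices, edges):
    """Smallest vertex cover by brute force over subsets of increasing size."""
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            if is_vertex_cover(edges, set(cand)):
                return set(cand)


star = [(0, leaf) for leaf in range(1, 6)]    # center 0 with five leaves
print(min_vertex_cover(range(6), star))       # {0}
```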


Theorem 3 (⋆). The E-GEFA problem is W[1]-hard when parameterized by
the vertex cover of the underlying graph, even when agents have 0/1 valuations 
over the goods. 


3.4 NP-Hardness on Paths 


To show the hardness of E-GEFA even when the underlying graph is a path,
we reduce from a variant of SAT called LINEAR SAT (abbreviated LSAT). 
In an LSAT instance, each clause has at most three literals, and further the 
literals of the formula can be sorted such that every clause corresponds to at 
most three consecutive literals in the sorted list, and each clause shares at most 
one of its literals with another clause, in which case this literal is extreme in 
both clauses. The hardness of LSAT was shown in [2]. In fact, by studying the 
reduced instance, one may assume that a “hard” instance of LSAT has the 
following structure: the first 2q clauses have two literals each and are of the 
following form: 


\[ A_i = \{s_i, \ell_i\}, \quad B_i = \{\ell_i, t_i\}, \qquad 1 \le i \le q, \]


where $s_i$, $\ell_i$, and $t_i$ denote literals, while the remaining p clauses have three
literals each and are mutually disjoint from each other as well as the first 2q 
clauses. For ease of description, we will assume that the LSAT formula that 
we reduce from has this particular structure. We are now ready to describe our 
reduction — in the interest of simplicity, our proof is designed to address the 
case when the graph is a disjoint union of paths, although it is easy to “stitch” 
these components into a single, longer path, as we will explain later. 


Theorem 4. The E-GEFA problem is NP-complete even when the graph
induced by the agents is a disjoint union of paths, and further, agents have 0/1 
valuations over the goods. 
Membership in NP is straightforward to check. We focus here on the reduction
demonstrating hardness. Let $\varphi$ be an instance of LSAT over variables
$X := \{x_1, \ldots, x_n\}$ and clauses:


\[ C := \{A_1, B_1, \ldots, A_q, B_q, C_1, \ldots, C_p\}, \]


as described above. We refer to the first 2q clauses as the coupled clauses
and the remaining as isolated clauses. We now turn to the construction of the
reduced instance $I_\varphi := (V, R, H, \mathcal{V})$. We define the set of goods R as
$R_V \cup R_C$, where:

\[ R_V := \{x_1, \bar{x}_1, y_1, \ldots, x_n, \bar{x}_n, y_n\} \]


and:

\[ R_C := \{g_1, \ldots, g_q, d_1, \ldots, d_p\}. \]


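The set of agents V is given by $X \cup C \cup Y \cup G \cup D$, where C is denoted
in the same way as in the LSAT instance, and further:


\[ X := \{X_1, \ldots, X_n\}, \quad Y := \{Y_1, \ldots, Y_n\}, \]
\[ G := \{G_1, \ldots, G_q\}, \quad D := \{D_1, \ldots, D_p\}. \]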


We now simultaneously describe the structure of the graph H and the pref- 
erences of the agents. 


Assignment Gadgets. For each $1 \le i \le n$, add an edge between $X_i$ and $Y_i$. The
agent $X_i$ values $\{x_i, \bar{x}_i, y_i\}$, while $Y_i$ values the good $y_i$ (and
nothing else).


Isolated Clause Gadgets. For each $1 \le i \le p$, we add an edge between agents $C_i$
and $D_i$. The agent $C_i$ values the literal $\ell$ if and only if $\ell \in C_i$, along
with $d_i$, while $D_i$ values the good $d_i$ (and nothing else).


Coupled Clause Gadgets. For each $1 \le i \le q$, we add an edge between agents $A_i$
and $B_i$, and also an edge between $G_i$ and $A_i$. The agent $G_i$ values the good
$g_i$ (and nothing else). Agents $A_i$ and $B_i$ value, respectively, the goods
$\{g_i, s_i, t_i, \ell_i\}$ and $\{s_i, t_i\}$.

This completes the description of the construction (Fig. 2). We defer the proof of
equivalence to the full version [15] due to lack of space.

We remark that it is possible to combine the connected components in the 
reduced instance above by simply introducing “dummy connector agents” that 
each value a corresponding dummy item and nothing else, leading to the following 
consequence. 


Fig. 2. A schematic of the reduced instance in the proof of Theorem 4, which is a
disjoint union of paths. $j_1$, $j_2$, $j_3$ are the literals of clause $C_i$.


Corollary 1. The E-GEFA problem is NP-complete even when the graph
induced by the agents is a path, and further, agents have 0/1 valuations over
the goods.


4 Proportionality for Graphs 


4.1 Local Proportionality: NP-hardness 


Theorem 5. The E-LPA problem is NP-complete on undirected graphs, even
when all agents have 0/1 valuations over the resources. 


Proof. We defer the proof to the full version [15] due to lack of space. 


4.2 Quasi-Global Proportionality: Efficient Algorithms 


To obtain efficient algorithms for finding Pareto-efficient allocations that respect
quasi-global proportionality, we model the problem as an integer linear program
(ILP) with a structured constraint matrix. In particular, it is well-known
that if the constraint matrix of an ILP is totally unimodular,⁶ then the
corresponding instance can be solved in polynomial time. We turn to an explanation
of our encoding.


Theorem 6. The problem of finding a Pareto-efficient allocation that is quasi-globally
proportional with respect to an underlying undirected graph on the agents
can be solved in polynomial time if all agents have 0/1 valuations. 


Proof. Let us assume there are n agents and m goods. We introduce a variable
$x_{ij}$ which indicates whether agent i gets good j, while the constant $a_{ij}$
indicates whether agent i likes good j. The constraints are as follows.


> We encode the fact that the allocation defined by x is well-defined by
introducing the following constraint for each good j:

\[ \forall j: \quad \sum_{i=1}^{n} x_{ij} \le 1. \]

> For each agent i, let $s_i$ be the number of items that have utility 1 for agent
i. For each agent i, introduce the following proportionality constraint:

\[ \forall i: \quad \sum_{j=1}^{m} a_{ij} x_{ij} \ge \frac{s_i}{d_i + 1}. \]


We let the objective function be $\sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} x_{ij}$. Note that
any assignment for which this function achieves a value of m is complete and
non-wasteful, and also respects quasi-global proportionality. It is straightforward
to verify that the constraint matrix for the ILP described above is totally
unimodular for any underlying graph H.
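As a rough illustration of the two constraint families, the following Python sketch checks a candidate allocation against them and evaluates the objective. It is not the algorithm of the proof, which solves the ILP itself; the function name `check_allocation`, the list-based encoding, and the reading of $d_i$ as the degree of agent i in H are assumptions made for this example.

```python
def check_allocation(likes, degree, assign):
    """Check a candidate allocation against the two constraint families.

    likes[i][j] is a_ij (0 or 1), degree[i] is d_i (assumed to be the degree
    of agent i in H), and assign[j] is the agent receiving good j, or None.
    Returns (well_defined, proportional, objective).
    """
    n = len(likes)
    m = len(assign)
    # x[i][j] = 1 exactly when agent i receives good j.
    x = [[1 if assign[j] == i else 0 for j in range(m)] for i in range(n)]
    # Each good goes to at most one agent.
    well_defined = all(sum(x[i][j] for i in range(n)) <= 1 for j in range(m))
    # sum_j a_ij x_ij >= s_i / (d_i + 1), cross-multiplied to stay in integers.
    proportional = all(
        sum(likes[i][j] * x[i][j] for j in range(m)) * (degree[i] + 1)
        >= sum(likes[i])
        for i in range(n)
    )
    objective = sum(likes[i][j] * x[i][j] for i in range(n) for j in range(m))
    return well_defined, proportional, objective


# Hypothetical instance: a path 0 - 1 - 2 on three agents and four goods.
likes = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
degree = [1, 2, 1]
print(check_allocation(likes, degree, [0, 1, 1, 2]))
```

In this toy instance the objective reaches m = 4, so the allocation is complete and non-wasteful in the sense used in the proof.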


6 A unimodular matrix is a square integer matrix having determinant +1 or −1. A
totally unimodular matrix is a matrix for which every square non-singular submatrix
is unimodular.
We remark that the problem of assigning goods in a proportional fashion (for
any of the notions of proportionality that we have introduced) beyond 0/1
valuations is NP-hard even when there are only two agents with identical valuations,
by a standard reduction from PARTITION, with the graph being a single edge on
two agents.
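The connection to PARTITION can be made concrete with a small sketch. Assuming two agents who share the same additive valuation with non-negative integer values, each agent can receive at least half of the total value only if the goods split into two bundles of exactly equal value. The function name `proportional_split_exists` is hypothetical, and the subset-sum table below is pseudo-polynomial, so it does not contradict the NP-hardness remark.

```python
def proportional_split_exists(values):
    """Two agents, identical additive valuations (non-negative integers).

    Both agents can get at least half of the total value only if some bundle
    is worth exactly half, which is the PARTITION question.
    """
    total = sum(values)
    if total % 2:
        return False
    reachable = {0}  # bundle values achievable so far
    for v in values:
        reachable |= {r + v for r in reachable}
    return total // 2 in reachable


print(proportional_split_exists([3, 1, 1, 2, 2, 1]))
```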
Abstract. The DEFENSIVE ALLIANCE problem has been studied extensively
during the last twenty years. A set R of vertices of a graph is a
defensive alliance if, for each element of R, the majority of its neighbours
are in R. The problem of finding a defensive alliance of minimum size
in a given graph is NP-hard. Fixed-parameter tractability results have
been obtained for the solution size and some structural parameters such
as the vertex cover number and neighbourhood diversity. For the parameter
treewidth the problem is W[1]-hard. However, for the parameters
pathwidth and feedback vertex set, the question of whether the problem
is FPT has remained open. In this work we prove that (1) the DEFENSIVE
ALLIANCE problem is W[1]-hard when parameterized by the pathwidth
of the input graph, (2) the EXACT DEFENSIVE ALLIANCE problem is
W[1]-hard parameterized by a wide range of fairly restrictive structural
parameters such as the feedback vertex set number, pathwidth, treewidth
and treedepth, and (3) a generalization of the DEFENSIVE ALLIANCE problem
is W[1]-hard parameterized by the size of a vertex deletion set into
trees of height at most 6.


Keywords: Defensive alliance · Parameterized complexity · FPT ·
W[1]-hard · Treewidth


1 Introduction 


Alliances in graphs were first introduced in 2000 by Kristiansen, Hedetniemi, and
Hedetniemi [12]. The purpose is to form coalitions of vertices able to defend each
other from attacks of other vertices (in the case of defensive alliances) or able
to collaborate to attack non-allied vertices (in the case of offensive alliances).
Alliances can be formed between nations in a security context, between companies
in a business context, or between people wishing to gather by affinity. The
alliance problems have been studied extensively during the last fifteen
years [2,7,15,17,18], and generalizations called r-alliances have also been studied [16].
Throughout this article, $G = (V, E)$ denotes a finite, simple and undirected
graph of order $|V| = n$. The subgraph induced by $S \subseteq V$ is denoted by $G[S]$.
For a vertex $v \in V$, we use $N_G(v) = \{u : (u, v) \in E(G)\}$ to denote the (open)
neighbourhood of vertex v in G, and $N_G[v] = N_G(v) \cup \{v\}$ to denote the closed
neighbourhood of v. The degree $d_G(v)$ of a vertex $v \in V(G)$ is $|N_G(v)|$. For a
subset $S \subseteq V(G)$, we define its closed neighbourhood as $N_G[S] = \bigcup_{v \in S} N_G[v]$
and its open neighbourhood as $N_G(S) = N_G[S] \setminus S$. For a non-empty subset
$S \subseteq V$ and a vertex $v \in V(G)$, $N_S(v)$ denotes the set of neighbours of v in S,
that is, $N_S(v) = \{u \in S : (u, v) \in E(G)\}$. We use $d_S(v) = |N_S(v)|$ to denote the
degree of vertex v in $G[S]$. The complement of the vertex set S in V is denoted
by $S^c$.


Definition 1. A non-empty set $R \subseteq V$ is a defensive alliance in $G = (V, E)$ if
$d_R(v) + 1 \ge d_{R^c}(v)$ for all $v \in R$.


A vertex $v \in R$ is said to be protected if $d_R(v) + 1 \ge d_{R^c}(v)$. A set $R \subseteq V$
is a defensive alliance if every vertex in R is protected. In this paper, we consider
DEFENSIVE ALLIANCE and EXACT DEFENSIVE ALLIANCE under structural parameters. We
define the problems as follows:


DEFENSIVE ALLIANCE
Input: An undirected graph $G = (V, E)$ and an integer $r \ge 1$.
Question: Is there a defensive alliance $R \subseteq V$ such that $|R| \le r$?


EXACT DEFENSIVE ALLIANCE
Input: An undirected graph $G = (V, E)$ and an integer $r \ge 1$.
Question: Is there a defensive alliance $R \subseteq V$ such that $|R| = r$?
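To make the two problem statements concrete, here is a minimal brute-force sketch in Python. It checks the protection condition $d_R(v) + 1 \ge d_{R^c}(v)$ for every vertex of a candidate set and then tries all candidate sets up to (or exactly of) size r. The function names and the adjacency-dictionary encoding are assumptions of this example, and the exhaustive search is exponential, so it is only meant for tiny graphs.

```python
from itertools import combinations


def is_defensive_alliance(adj, R):
    """adj maps each vertex to the set of its neighbours; R is a vertex subset."""
    R = set(R)
    if not R:
        return False  # alliances are non-empty by definition
    for v in R:
        inside = len(adj[v] & R)    # d_R(v)
        outside = len(adj[v] - R)   # d_{R^c}(v)
        if inside + 1 < outside:
            return False            # v is not protected
    return True


def find_defensive_alliance(adj, r, exact=False):
    """Return an alliance of size at most r (exactly r if exact), or None."""
    sizes = [r] if exact else range(1, r + 1)
    for size in sizes:
        for candidate in combinations(sorted(adj), size):
            if is_defensive_alliance(adj, candidate):
                return set(candidate)
    return None


# A 4-cycle: no single vertex is protected, but two adjacent vertices are.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(find_defensive_alliance(cycle, 2))
```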


For standard notations and definitions in graph theory and parameterized 
complexity, we refer to West [19] and Cygan et al. [3], respectively. The graph 
parameters we explicitly use in this paper are feedback vertex set number, path- 
width, treewidth and treedepth. 


Definition 2. For a graph $G = (V, E)$, the feedback vertex set number, denoted
by fvs(G), is the cardinality of a smallest set $S \subseteq V(G)$ such that the graph
$G - S$ is a forest.
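The defining condition is easy to test for a given set S. The sketch below, with the hypothetical name `is_feedback_vertex_set`, deletes S and uses a union-find structure to detect whether any cycle survives; it assumes a simple graph given as a vertex list and an edge list, and it verifies a candidate set rather than computing a smallest one.

```python
def is_feedback_vertex_set(vertices, edges, S):
    """Return True if deleting S from the simple graph leaves a forest."""
    S = set(S)
    parent = {v: v for v in vertices if v not in S}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        if u in S or w in S:
            continue              # the edge disappears together with S
        ru, rw = find(u), find(w)
        if ru == rw:
            return False          # this edge would close a cycle
        parent[ru] = rw
    return True


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(is_feedback_vertex_set(range(4), edges, {0}))
```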


We now review the concept of a tree decomposition, introduced by Robertson 
and Seymour in [14]. Treewidth is a measure of how “tree-like” the graph is. 


Definition 3. [4] A tree decomposition of a graph $G = (V, E)$ is a tree T together
with a collection of subsets $X_t$ (called bags) of V labeled by the vertices t of T
such that $\bigcup_{t \in T} X_t = V$ and (1) and (2) below hold:


1. For every edge $uv \in E(G)$, there is some t such that $\{u, v\} \subseteq X_t$.
2. (Interpolation Property) If t is a vertex on the unique path in T from $t_1$ to
$t_2$, then $X_{t_1} \cap X_{t_2} \subseteq X_t$.
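The conditions of a tree decomposition can be checked mechanically. The sketch below is an illustration under stated assumptions: the names `is_tree_decomposition` and `width` are hypothetical, the argument `tree_edges` is assumed to describe a tree on the bag labels (this is not verified), and the interpolation property is tested through its standard equivalent form, namely that the bags containing any fixed vertex induce a connected part of T.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """bags maps each tree node t to the set X_t; tree_edges lists the edges of T."""
    # The bags must cover exactly the vertex set.
    if set().union(*bags.values()) != set(vertices):
        return False
    # (1) Every graph edge lies inside some bag.
    for u, v in edges:
        if not any(u in X and v in X for X in bags.values()):
            return False
    # (2) For every vertex, the tree nodes whose bags contain it are connected in T.
    tree_adj = {t: set() for t in bags}
    for a, b in tree_edges:
        tree_adj[a].add(b)
        tree_adj[b].add(a)
    for v in vertices:
        nodes = {t for t, X in bags.items() if v in X}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for nxt in tree_adj[t]:
                if nxt in nodes and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Maximum bag size minus one."""
    return max(len(X) for X in bags.values()) - 1


# A path 0-1-2-3 with the obvious decomposition into three bags of size two.
bags = {"p": {0, 1}, "q": {1, 2}, "r": {2, 3}}
print(is_tree_decomposition(range(4), [(0, 1), (1, 2), (2, 3)], bags,
                            [("p", "q"), ("q", "r")]), width(bags))
```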
Definition 4. [4] The width of a tree decomposition is the maximum value of
$|X_t| - 1$ taken over all the vertices t of the tree T of the decomposition. The
treewidth tw(G) of a graph G is the minimum width among all possible tree
decompositions of G.


Definition 5. If the tree T of a tree decomposition is a path, then we say that 
the tree decomposition is a path decomposition, and use pathwidth in place of 
treewidth. 


A rooted forest is a disjoint union of rooted trees. Given a rooted forest F,
its transitive closure is a graph H in which V(H) contains all the nodes of the
rooted forest, and E(H) contains an edge between two vertices if and only if those
two vertices form an ancestor-descendant pair in the forest F.


Definition 6. The treedepth of a graph G is the minimum height of a rooted
forest F whose transitive closure contains the graph G. It is denoted by td(G).


For the standard concepts in parameterized complexity, see the recent textbook 
by Cygan et al. [3]. 


1.1 Our Main Results 


The goal of this paper is to provide new insight into the complexity of DEFENSIVE 
ALLIANCE parameterized by the structure of the input graph. In this paper, we 
prove the following results: 


— the DEFENSIVE ALLIANCE problem is W[1]-hard parameterized by the path- 
width of the input graph. 

— the EXACT DEFENSIVE ALLIANCE problem is W[1]-hard parameterized by
any of the following parameters: the feedback vertex set number, treedepth 
and pathwidth of the input graph. 

— a generalization of the DEFENSIVE ALLIANCE problem is W[1]-hard parame- 
terized by the size of a vertex deletion set into trees of height at most 6. 


1.2 Known Results


Fernau and Raible showed in [6] that the defensive, offensive and powerful
alliance problems and their global variants are fixed-parameter tractable when
parameterized by the solution size k. Kiyomi and Otachi showed in [10] that the
problems of finding smallest alliances of all kinds are fixed-parameter tractable
when parameterized by the vertex cover number. The problems of finding smallest
defensive and offensive alliances are also fixed-parameter tractable when
parameterized by the neighbourhood diversity [8]. Enciso [5] proved that finding
defensive and global defensive alliances is fixed-parameter tractable when
parameterized by domino treewidth. Bliem and Woltran [1] proved that deciding if
a graph contains a defensive alliance of size at most k is W[1]-hard when
parameterized by the treewidth of the input graph. This puts it among the few
problems that are FPT when parameterized by solution size but not when
parameterized by treewidth (unless FPT = W[1]).
2 Hardness Results of Defensive Alliance


In this section we prove the following theorems: 


Theorem 1. The DEFENSIVE ALLIANCE problem is W[1]-hard parameterized 
by the pathwidth of the input graph. 


Theorem 2. The EXACT DEFENSIVE ALLIANCE problem is W[1]-hard parameterized
by any of the following parameters: the feedback vertex set number,
treedepth and pathwidth of the input graph.


We introduce several variants of DEFENSIVE ALLIANCE that we require in our
proofs. The problem DEFENSIVE ALLIANCE^F generalizes DEFENSIVE ALLIANCE:
some vertices are forced to be outside the solution; these vertices
are called “forbidden” vertices. This variant can be formalized as follows:


DEFENSIVE ALLIANCE^F
Input: An undirected graph $G = (V, E)$, an integer r and a set $V_\square \subseteq V$.
Question: Is there a defensive alliance $R \subseteq V$ such that (i) $|R| \le r$, and (ii)
$R \cap V_\square = \emptyset$?


DEFENSIVE ALLIANCE^FN is a further generalization that, in addition, requires
some “necessary” vertices to be in R. This variant can be formalized as follows:


DEFENSIVE ALLIANCE^FN
Input: An undirected graph $G = (V, E)$, an integer r, a set $V_\triangle \subseteq V$, and a
set $V_\square \subseteq V(G)$.
Question: Is there a defensive alliance $R \subseteq V$ such that (i) $|R| \le r$, (ii)
$R \cap V_\square = \emptyset$, and (iii) $V_\triangle \subseteq R$?


While the DEFENSIVE ALLIANCE problem asks for a defensive alliance of
size at most r, we also consider the EXACT DEFENSIVE ALLIANCE
problem that concerns defensive alliances of size exactly r. Analogously,
we also define exact versions of the two generalizations of DEFENSIVE
ALLIANCE presented above. To show W[1]-hardness of DEFENSIVE ALLIANCE,
we consider the MULTIDIMENSIONAL SUBSET SUM (MSS) problem.


MULTIDIMENSIONAL SUBSET SUM (MSS)
Input: An integer k, a set $S = \{s_1, \ldots, s_n\}$ of vectors with $s_i \in \mathbb{N}^k$ for every
i with $1 \le i \le n$ and a target vector $t \in \mathbb{N}^k$.
Parameter: k
Question: Is there a subset $S' \subseteq S$ such that $\sum_{s \in S'} s = t$?


We introduce two variants of MSS that we require in our proofs. In the
MULTIDIMENSIONAL RELAXED SUBSET SUM (MRSS) problem, an additional integer $k'$
is given (which will be part of the parameter) and we ask whether there is a
subset $S' \subseteq S$ with $|S'| \le k'$ such that $\sum_{s \in S'} s \ge t$. It is known
that MRSS is W[1]-hard when parameterized by the combined parameter
$k + k'$, even if all integers in the input are given in unary [9]. For the EXACT
MRSS problem, both the input and the parameters are the same as in
the case of MRSS; however, one now asks whether there is a subset $S' \subseteq S$
with $|S'| = k'$ such that $\sum_{s \in S'} s \ge t$. This variant can be formalized as follows:


EXACT MULTIDIMENSIONAL RELAXED SUBSET SUM (EXACT MRSS)
Input: An integer k, an integer $k'$, a set $S = \{s_1, \ldots, s_n\}$ of vectors with
$s_i \in \mathbb{N}^k$ for every i with $1 \le i \le n$ and a target vector $t \in \mathbb{N}^k$.
Parameter: $k, k'$
Question: Is there a subset $S' \subseteq S$ with $|S'| = k'$ such that $\sum_{s \in S'} s \ge t$?
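For intuition, the relaxed and exact variants can be explored on toy inputs with a brute-force search. The following Python sketch is only an illustration: the name `mrss` and its `exact` flag are hypothetical, vectors are given as tuples of non-negative integers, and the running time is exponential in the number of vectors.

```python
from itertools import combinations


def mrss(vectors, target, k_prime, exact=False):
    """Return indices of a subset S' whose coordinate-wise sum is >= target.

    The subset has size at most k_prime, or exactly k_prime if exact is True.
    Returns None when no such subset exists.
    """
    sizes = [k_prime] if exact else range(0, k_prime + 1)
    for size in sizes:
        for subset in combinations(range(len(vectors)), size):
            sums = [sum(vectors[i][d] for i in subset) for d in range(len(target))]
            if all(a >= b for a, b in zip(sums, target)):
                return list(subset)
    return None


vectors = [(1, 0), (0, 2), (2, 2)]
print(mrss(vectors, (2, 2), 1))              # relaxed variant
print(mrss(vectors, (2, 2), 2, exact=True))  # exact variant
```

Because all entries are non-negative, adding further vectors to a feasible subset keeps it feasible, which is the intuition behind relating the exact variant to MRSS.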


Lemma 1. EXACT MRSS is W[1]-hard when parameterized by the combined
parameter $k + k'$, even if all integers in the input are given in unary.


This follows from the fact that the MRSS problem is W[1]-hard even if all integers
in the input are given in unary. We now show that the DEFENSIVE ALLIANCE^FN
problem is W[1]-hard parameterized by the size of a vertex deletion set to trees
of height at most four, i.e., a subset D of the vertices of the graph such that
every component in the graph, after removing D, is a tree of height at most four.


Lemma 2. The DEFENSIVE ALLIANCE^FN problem is W[1]-hard parameterized
by the size of a vertex deletion set into trees of height at most 4.


Proof. To prove this we reduce from MRSS, which is known to be W[1]-hard
when parameterized by the combined parameter $k + k'$ [9]. Let $I = (k, k', S, t)$ be
an instance of MRSS. We construct an instance $I' = (G, r, V_\triangle, V_\square)$ of
DEFENSIVE ALLIANCE^FN in the following way. See Fig. 1 for an illustration. First, we
introduce a set of k new vertices $U = \{u_1, u_2, \ldots, u_k\}$. For every $u_i \in U$, we
create a set $V^{\square}_{u_i}$ of $\sum_{s \in S} s(i)$ one degree forbidden vertices and a set
$V^{\triangle}_{u_i}$ of $2\bigl(\sum_{s \in S} s(i) - t(i)\bigr)$ one degree necessary vertices; and make
$u_i$ adjacent to every vertex of $V^{\square}_{u_i} \cup V^{\triangle}_{u_i}$. For each vector
$s = (s(1), s(2), \ldots, s(k)) \in S$, we introduce a tree $T_s$ into G. Define
$\max(s) = \max_{1 \le i \le k} \{s(i)\}$. We introduce two vertices
$x_s$ and $y_s$, and introduce two sets of new vertices $A_s = \{a^s_1, \ldots, a^s_{\max(s)}\}$ and
$B_s = \{b^s_1, \ldots, b^s_{\max(s)}\}$. For each $i \in \{1, 2, \ldots, k\}$ and for each $s \in S$, we make
$u_i$ adjacent to exactly $s(i)$ many vertices of $A_s$ in an arbitrary manner. Next,
for every vertex $a^s \in A_s$, we add a set $V^{\square}_{a^s}$ of $|N_U(a^s)| + 3$ many one degree
forbidden vertices adjacent to $a^s$. The vertex set of tree $T_s$ is defined as follows:
\[ V(T_s) = A_s \cup B_s \cup \bigcup_{a^s \in A_s} V^{\square}_{a^s} \cup \{x_s, y_s\}. \]

We now create the edge set of $T_s$:

\[ E(T_s) = \{(x_s, \alpha), (y_s, \beta), (x_s, y_s) \mid \alpha \in A_s, \beta \in B_s\}
   \cup \bigcup_{a^s \in A_s} \{(a^s, \alpha) \mid \alpha \in V^{\square}_{a^s}\}. \]


Fig. 1. The reduction from MRSS to DEFENSIVE ALLIANCE^FN in Lemma 2. Note that
the edges between the vertices in U and $A_s$ are not shown. Gadgets in the green square
correspond to vectors in set S.


Next, two vertices a, b are introduced into G. Make a adjacent to all the vertices
of $\bigcup_{s \in S} (A_s \cup B_s)$ and also make a adjacent to b. We define

\[ V_\triangle = \bigcup_{i=1}^{k} V^{\triangle}_{u_i} \cup \{a\} \cup U, \qquad
   V_\square = \{b\} \cup \bigcup_{s \in S} \bigcup_{a^s \in A_s} V^{\square}_{a^s} \cup \bigcup_{i=1}^{k} V^{\square}_{u_i}, \]

and set

\[ r = 2 \sum_{i=1}^{k} \Bigl( \sum_{s \in S} s(i) - t(i) \Bigr) + \sum_{i=1}^{n} \max(s_i) + k + k' + 1. \]

Observe that if we remove the set $U \cup \{a\}$ of $k + 1$ vertices from G, each
connected component of the resulting graph is a tree with height at most 4.
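Formally, we claim that I is a yes-instance if and only if $I'$ is a yes-instance.


Let $S' \subseteq S$ be such that $|S'| \le k'$ and $\sum_{s \in S'} s \ge t$. Then we claim that the set

\[ R = V_\triangle \cup \bigcup_{s \in S'} (A_s \cup \{x_s\}) \cup \bigcup_{s \in S \setminus S'} B_s
     = \bigcup_{i=1}^{k} V^{\triangle}_{u_i} \cup \{a\} \cup U \cup \bigcup_{s \in S'} (A_s \cup \{x_s\}) \cup \bigcup_{s \in S \setminus S'} B_s \]

is a defensive alliance in G such that $|R| \le r$, $V_\triangle \subseteq R$, and $V_\square \cap R = \emptyset$.
Let x be an arbitrary element of R.

Case 1: If $x = u_i \in U$, then

\begin{align*}
d_R(u_i) &= \sum_{s \in S'} s(i) + |V^{\triangle}_{u_i}| = \sum_{s \in S'} s(i) + 2 \sum_{s \in S} s(i) - 2t(i) \\
&= \Bigl( \sum_{s \in S'} s(i) - t(i) \Bigr) + \Bigl( \sum_{s \in S} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&\ge \Bigl( \sum_{s \in S} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&= \sum_{s \in S \setminus S'} s(i) + \Bigl( \sum_{s \in S'} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&\ge \sum_{s \in S \setminus S'} s(i) + \sum_{s \in S} s(i) = \sum_{s \in S \setminus S'} s(i) + |V^{\square}_{u_i}| \\
&= d_{R^c}(u_i).
\end{align*}

Therefore, we have $d_R(u_i) + 1 \ge d_{R^c}(u_i)$, and hence $u_i$ is protected.

Case 2: If $x = a^s \in A_s$, then $d_R(a^s) = |N_U(a^s)| + |\{a, x_s\}| = |N_U(a^s)| + 2$ and
$d_{R^c}(a^s) = |V^{\square}_{a^s}| = |N_U(a^s)| + 3$. Therefore, we have $d_R(a^s) + 1 \ge d_{R^c}(a^s)$.

Case 3: If $x = a$, then $N_R(a) = \bigcup_{s \in S'} A_s \cup \bigcup_{s \in S \setminus S'} B_s$ and
$N_{R^c}(a) = \bigcup_{s \in S'} B_s \cup \bigcup_{s \in S \setminus S'} A_s \cup \{b\}$. As $|A_s| = |B_s|$,
we have $d_R(a) + 1 \ge d_{R^c}(a)$.

For the rest of the vertices in R, it is easy to see that they are protected.
Therefore, $I'$ is a yes-instance.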
For the reverse direction, suppose that G has a defensive alliance R of size at
most r such that $V_\triangle \subseteq R$ and $V_\square \cap R = \emptyset$. From the definition of
$V_\triangle$, we have $U \subseteq R$. We know $V_\triangle$ contains
$1 + k + \sum_{i=1}^{k} 2\bigl(\sum_{s \in S} s(i) - t(i)\bigr)$ vertices; thus besides
the vertices of $V_\triangle$, there are at most $\sum_{i=1}^{n} \max(s_i) + k'$ vertices in R.
Since $a \in R$, it must have at least $\sum_{i=1}^{n} \max(s_i)$ many neighbours in R from
the set $\bigcup_{s \in S} (A_s \cup B_s)$. We also observe that if a vertex $a^s$ from the set
$A_s$ is in the solution then $x_s$ also lies in the solution for the protection of
$a^s$. This shows that at most $k'$ many sets of the form $A_s$ contribute to the
solution as otherwise the size of the solution exceeds r. Therefore, any arbitrary
defensive alliance R of size at most r can be transformed to another defensive
alliance $R'$ of size at most r as follows:

\[ R' = V_\triangle \cup \bigcup_{x_s \in R} (A_s \cup \{x_s\}) \cup \bigcup_{x_s \in V(G) \setminus R} B_s. \]
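We define a subset $S' = \{s \in S \mid x_s \in R'\}$. Clearly, $|S'| \le k'$. We claim that
$\sum_{s \in S'} s(i) \ge t(i)$ for all $1 \le i \le k$. Assume for the sake of contradiction that
$\sum_{s \in S'} s(i) < t(i)$ for some $i \in \{1, 2, \ldots, k\}$. Then, we have

\begin{align*}
d_{R'}(u_i) &= \sum_{s \in S'} s(i) + |V^{\triangle}_{u_i}| = \sum_{s \in S'} s(i) + 2 \sum_{s \in S} s(i) - 2t(i) \\
&= \Bigl( \sum_{s \in S'} s(i) - t(i) \Bigr) + \Bigl( \sum_{s \in S} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&< \Bigl( \sum_{s \in S} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&= \sum_{s \in S \setminus S'} s(i) + \Bigl( \sum_{s \in S'} s(i) - t(i) \Bigr) + \sum_{s \in S} s(i) \\
&< \sum_{s \in S \setminus S'} s(i) + \sum_{s \in S} s(i) = \sum_{s \in S \setminus S'} s(i) + |V^{\square}_{u_i}| \\
&= d_{R'^c}(u_i).
\end{align*}

Since all quantities are integers, the two strict inequalities give
$d_{R'}(u_i) + 1 < d_{R'^c}(u_i)$. We also know that $u_i \in R'$, which contradicts the
fact that $R'$ is a defensive alliance. This shows that I is a yes-instance.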


Corollary 1. The DEFENSIVE ALLIANCE^FN problem is W[1]-hard parameterized
by the size of a vertex deletion set into trees of height at most 5, even when
$|V_\triangle| = 1$.


Proof. Given an instance $I = (G, r, V_\triangle, V_\square)$ of DEFENSIVE ALLIANCE^FN,
we construct an equivalent instance $I' = (G', r', V'_\triangle, V'_\square)$ of DEFENSIVE
ALLIANCE^FN where $|V'_\triangle| = 1$. See Fig. 2 for an illustration. Let $v_1, v_2, \ldots, v_\ell$
be the vertices of $V_\triangle$. We introduce a necessary vertex x and make x adjacent to all
the vertices in $V_\triangle$. We introduce a set $V^{\square}_x$ of $\ell + 1$ one degree forbidden
vertices adjacent to x. For every $v_i \in V_\triangle$, add a degree one forbidden vertex
$v_i^{\square}$ adjacent to $v_i$. We define $V'_\triangle = \{x\}$ and
$V'_\square = V^{\square}_x \cup V_\square \cup \bigcup_{i=1}^{\ell} \{v_i^{\square}\}$. We define $G'$ as follows:

\[ V(G') = V(G) \cup \{x\} \cup V^{\square}_x \cup \bigcup_{i=1}^{\ell} \{v_i^{\square}\} \]

and

\[ E(G') = E(G) \cup \{(x, \alpha), (x, v), (v, v^{\square}) \mid \alpha \in V^{\square}_x, v \in V_\triangle\}. \]


Let H be a set with at most k vertices in G such that $G - H$ is a forest with
trees of height at most 4. Clearly, the set $H \cup \{x\}$ is of size at most $k + 1$ and
$G' - (H \cup \{x\})$ is a forest with trees of height at most 5. It is easy to see that I
and $I'$ are equivalent instances.


We can get an analogous result for the exact variant. 
Fig. 2. An illustration of the gadget to reduce the number of necessary vertices to one.


Corollary 2. The EXACT DEFENSIVE ALLIANCE^FN problem is W[1]-hard
when parameterized by the size of a vertex deletion set into trees of height
at most 5, even when $|V_\triangle| = 1$.


Corollary 3. The DEFENSIVE ALLIANCE^FN problem is W[1]-hard when
parameterized by pathwidth and treedepth of the input graph.


Proof. To prove this we reduce from MRSS, which is known to be W[1]-hard
when parameterized by the combined parameter $k + k'$ [9]. Let $I = (k, k', S, t)$
be an instance of MRSS. We construct an instance $I' = (G, r, V_\triangle, V_\square)$ of
DEFENSIVE ALLIANCE^FN as in Lemma 2. Note that the pathwidth and treedepth of a
tree are at most its height. We claim that since G has a vertex deletion
set $D = U \cup \{a\}$ of size $k + 1$ into trees of height at most 4, G has a path
decomposition of width at most $k + 4$. First, we take a path decomposition of the
trees of height at most 4, which has width at most 3; then we add the $k + 1$ vertices
of D to all the bags to get a path decomposition of G. This implies that the
pathwidth of G is bounded by $k + 4$. To bound the treedepth of G, note that
$G \setminus D$ is a rooted forest where the trees are of height at most 4. We add a path
of length at most $k + 1$ above the roots, covering the vertices of D, such that the
transitive closure of the resulting forest contains G. The height of the resulting
forest is at most $k + 5$. This implies that the treedepth of G is bounded by $k + 5$.


Lemma 3. The EXACT DEFENSIVE ALLIANCE^F problem is W[1]-hard parameterized
by the size of a vertex deletion set into trees of height at most 5.


Proof. To prove this we reduce from EXACT DEFENSIVE ALLIANCE^FN with
$|V_\triangle| = 1$, which is W[1]-hard when parameterized by the size of a vertex deletion
set into trees of height at most 5; see Corollary 2. Let $I = (G, r, V_\triangle, V_\square)$ with
$|V_\triangle| = 1$ be an instance of EXACT DEFENSIVE ALLIANCE^FN. Let $n = |V(G)|$,
$r \le n$ and $V_\triangle = \{x\}$. We construct an instance $I' = (G', r', V'_\square)$ of the EXACT
DEFENSIVE ALLIANCE^F problem in the following way. See Fig. 3 for an illustration.

We introduce a set of vertices $H = \{h_1, h_2, \ldots, h_{2n}\}$ and a vertex $x'$. We
also add a set $H^{\square}$ of one degree forbidden vertices adjacent to x. Similarly, we
add a set $H^{\square\prime}$ of one degree forbidden vertices adjacent to $x'$. We add three new
forbidden vertices a, b and c. All the vertices in set H are adjacent to x, $x'$, a, b
and c. We define $G'$ as follows:

\[ V(G') = V(G) \cup H \cup H^{\square} \cup H^{\square\prime} \cup \{a, b, c, x'\} \]

and

\[ E(G') = E(G) \cup \bigcup_{h \in H} \{(x, h), (x', h), (a, h), (b, h), (c, h)\}
   \cup \bigcup_{h^{\square} \in H^{\square}} \{(x, h^{\square})\}
   \cup \bigcup_{h^{\square\prime} \in H^{\square\prime}} \{(x', h^{\square\prime})\}. \]


Fig. 3. An illustration of the reduction of necessary vertices in EXACT DEFENSIVE
ALLIANCE^FN to EXACT DEFENSIVE ALLIANCE^F. The vertex x may have additional
neighbours in G.


We define $V'_\square = V_\square \cup H^{\square} \cup H^{\square\prime} \cup \{a, b, c\}$ and set $r' = r + 2n + 1$.
Suppose that the size of a vertex deletion set of G into trees of height at most 5
is k. Adding the vertices $\{x, x', a, b, c\}$ to that set, we get a vertex deletion set
of $G'$ into trees of height at most 5. The size is clearly bounded by $k + 5$.

We claim that I is a yes-instance if and only if $I'$ is a yes-instance. Suppose
there is a defensive alliance R of size exactly r in G such that $x \in R$ and
$V_\square \cap R = \emptyset$. It is easy to check that $R' = R \cup H \cup \{x'\}$ is a defensive
alliance of size $r + 2n + 1$ such that $V'_\square \cap R' = \emptyset$. This implies that $I'$ is a
yes-instance. To prove the reverse direction of the equivalence, suppose there is a
defensive alliance $R'$ of size $r'$ such that $V'_\square \cap R' = \emptyset$. We claim that
$x \in R'$. For the sake of contradiction, suppose $x \notin R'$. Then no vertex from the
set H is part of $R'$ as $d_{R'}(h) < 2$ and $d_{R'^c}(h) \ge 4$ for each $h \in H$. This
implies that $|R'| \le n < r'$, a contradiction. Therefore $x \in R'$. Observe that at
least one vertex from H must be part of $R'$ for the protection of x in $R'$. Without
loss of generality assume that $h_1 \in R'$. We see that the protection of $h_1$
requires $x'$ to be inside the solution. Therefore, $x'$ is in $R'$. Now, the
protection of $x'$ requires 2n many vertices which can only be contributed by H. It
implies that $H \subseteq R'$. We claim that $R = R' \setminus (H \cup \{x'\})$ forms a defensive
alliance of size exactly r in G. Since x is the only vertex of G adjacent to a
removed vertex, we only need to show that x is protected in R. This is true since
x loses 2n neighbours both inside and outside the solution when passing from $G'$
to G. This shows that I is a yes-instance.

In [1], Bliem and Woltran proved that the DEFENSIVE ALLIANCE problem is
W[1]-hard when parameterized by the treewidth of the input graph by giving a
reduction from the DEFENSIVE ALLIANCE^FN problem, which they in turn showed to
be W[1]-hard when parameterized by the treewidth of the graph. Since we have
the stronger result that DEFENSIVE ALLIANCE^FN is W[1]-hard when parameterized
by the pathwidth of the input graph, we now prove that the DEFENSIVE
ALLIANCE^F problem is W[1]-hard when parameterized by the pathwidth of the
input graph. We obtain the following hardness result using the reduction of
Lemma 5 given in [1], observing that the resulting graph has bounded pathwidth.


Lemma 4. The DEFENSIVE ALLIANCE^F problem is W[1]-hard when parameterized
by the pathwidth of the graph.


Now we give an FPT reduction to get rid of forbidden vertices. The same reduc- 
tion holds for the exact version of the problem as well. 


2.1 Proof of Theorem 1 


Proof. To prove Theorem 1 we reduce from the DEFENSIVE ALLIANCE^F problem,
which is W[1]-hard when parameterized by the pathwidth of the input
graph; see Lemma 4. Let $I = (G, r, V_\square)$ be an instance of the DEFENSIVE
ALLIANCE^F problem. We construct an instance $I' = (G', r')$ of the DEFENSIVE
ALLIANCE problem in the following way. We set $r' = r$. For every vertex $u \in V_\square$,
we introduce a set $V_u = \{u_1, u_2, \ldots, u_{2r+2}\}$ of $2r + 2$ many vertices adjacent to
u. We also add a new vertex t which is adjacent to all the vertices in
$\bigcup_{u \in V_\square} V_u$. Clearly, the pathwidth of $G'$ is at most the pathwidth of G plus
two. We claim that I is a yes-instance if and only if $I'$ is a yes-instance. Suppose
there is a defensive alliance R of size at most r in G such that $V_\square \cap R = \emptyset$.
Clearly $R' = R$ is also a defensive alliance of size at most $r' = r$ in $G'$. This
implies that $I'$ is a yes-instance.


To prove the reverse direction of the equivalence, suppose there is a defensive
alliance $R'$ in $G'$ of size at most $r' = r$. We observe that any defensive alliance
in $G'$ containing a vertex from the set $V_\square \cup \bigcup_{u \in V_\square} V_u \cup \{t\}$ is of
size at least $r + 1$. Since $|R'| \le r' = r$, this implies that
$R' \cap \bigl(V_\square \cup \bigcup_{u \in V_\square} V_u \cup \{t\}\bigr) = \emptyset$. We see that
$R = R'$ is a defensive alliance of size at most r such that $R \cap V_\square = \emptyset$.
This shows that I is a yes-instance.
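The construction in this proof is simple enough to write down directly. The following Python sketch builds $G'$ from an adjacency dictionary, attaching $2r + 2$ new vertices to every forbidden vertex and joining all of them to one extra vertex t. The function name `eliminate_forbidden` and the tuple labels `("pad", u, i)` and `("t",)` are hypothetical choices for this example and are assumed not to clash with existing vertex names. One deviation is labeled in the code: t is only added when the forbidden set is non-empty, since an isolated t would trivially form an alliance.

```python
def eliminate_forbidden(adj, r, forbidden):
    """Build (G', r') from (G, r, forbidden set) as in the proof of Theorem 1.

    adj maps each vertex to the set of its neighbours; the input is not modified.
    """
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    if not forbidden:
        # Assumption of this sketch: nothing to do without forbidden vertices.
        return new_adj, r
    t = ("t",)                               # hypothetical label for the vertex t
    new_adj[t] = set()
    for u in forbidden:
        for i in range(1, 2 * r + 3):        # the vertices u_1, ..., u_{2r+2}
            w = ("pad", u, i)                # hypothetical label for u_i
            new_adj[w] = {u, t}
            new_adj[u].add(w)
            new_adj[t].add(w)
    return new_adj, r                        # r' = r


# A path 0 - 1 - 2 whose middle vertex is forbidden, with r = 1.
g_prime, r_prime = eliminate_forbidden({0: {1}, 1: {0, 2}, 2: {1}}, 1, {1})
print(len(g_prime), r_prime)
```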


2.2 Proof of Theorem 2 


Proof. To prove Theorem 2 we reduce from the EXACT DEFENSIVE ALLIANCE^F
problem, which is W[1]-hard when parameterized by the size of a vertex deletion
set into trees of height at most 5; see Lemma 3. The reduction here
is the same as in the proof of Theorem 1. Therefore, the EXACT DEFENSIVE
ALLIANCE problem is W[1]-hard when parameterized by the size of a vertex deletion
set into trees of height at most 6. Clearly trees of height at most 6 are
trivially acyclic. Moreover, it is easy to verify that such trees have pathwidth [11]
and treedepth [13] at most 6, which implies that the EXACT DEFENSIVE ALLIANCE
problem is W[1]-hard when parameterized by any of the following parameters: the
feedback vertex set number, pathwidth, treewidth and treedepth of the input graph.


Corollary 4. The DEFENSIVE ALLIANCE^N problem with exactly one necessary
vertex is W[1]-hard when parameterized by the size of a vertex deletion set into
trees of height at most 6.


Proof. To prove this we reduce from the DEFENSIVE ALLIANCE^FN problem with
$|V_\triangle| = 1$, which is W[1]-hard when parameterized by the size of a vertex deletion
set into trees of height at most 5; see Corollary 1. The reduction here is the same
as in the proof of Theorem 1.


3 Conclusions 


In this paper, we have proved that the DEFENSIVE ALLIANCE problem is W[1]-hard
when parameterized by the pathwidth of the input graph and the EXACT
DEFENSIVE ALLIANCE problem is W[1]-hard parameterized by a wide range of
fairly restrictive structural parameters such as the feedback vertex set number, 
pathwidth, treewidth and treedepth of the input graph. The parameterized
complexity of the DEFENSIVE ALLIANCE problem remains unsettled when
parameterized by the feedback vertex set number and treedepth of the input
graph. It would also be interesting to consider the parameterized complexity with
respect to twin cover and modular width.
Abstract. In the field of transportation planning, it is often insufficient
to model transportation networks by using networks with fixed arc costs. 
There may be additional factors that modify the time or cost of a single 
trip. These include turn prohibitions, fare rebates, and transfer times. 
Each of these factors causes the cost of a portion of the trip to depend 
directly on the previous portion of the trip. This dependence can be 
modeled using arc-dependent networks. In an arc-dependent network, the 
cost of an arc a depends upon the arc used to enter a. In this paper, we 
study the approximability of a number of negative cost cycle problems in 
arc-dependent networks. In a general network, the cost of an arc is a fixed 
constant and part of the input. Arc-dependent networks can be used to 
model several real-world problems, including the turn-penalty shortest 
path problem. Previous literature established that corresponding path 
problems in these networks are NP-hard. We extend that research by 
providing inapproximability results for several of these problems. In [7], it 
was established that a more general form of the shortest path problem in 
arc-dependent networks, known as the quadratic shortest path problem, 
cannot be approximated to within a constant factor. In this paper, we 
strengthen that result by showing NPO PB-completeness. 


1 Introduction 


This paper studies the inapproximability of several problems associated with
negative cost cycles in arc-dependent networks. Recall that in a traditional
network $G = (V, E)$, we have a set of vertices $V = \{v_1, v_2, \ldots, v_n\}$, a set of
directed arcs $E = \{e_1, e_2, \ldots, e_m\}$, and a cost function $c : E \to \mathbb{R}$. The cost
of an arc is a fixed constant. This is in contrast to arc-dependent networks in which the
cost of an arc depends upon the arc taken to enter it. The inapproximability
results obtained in this paper are stronger than similar results obtained for the
quadratic shortest path problem [7].


© Springer Nature Switzerland AG 2022 
N. Balachandran and R. Inkulu (Eds.): CALDAM 2022, LNCS 13179, pp. 292-304, 2022. 

In addition, the shortest paths do not necessarily need to be simple. Recall
that a simple path is a path without repeated vertices or repeated arcs. In fact, in 
some cases, non-simple paths are actually shorter than simple paths depending 
on the path taken. In [13], an O(m?-n?*~+) algorithm is provided for solving the 
path-dependent shortest path problem, where k is the number of predecessors 
used to determine the arc cost. Our work, however, focuses exclusively on simple 
paths. Our problem is NP-complete even when the cost of an arc depends on 
only one predecessor [14]. 

Our problem is a restricted version of the quadratic shortest paths prob- 
lem (QSPP) [7,8]. QSPP cannot be approximated to within any constant factor 
unless P = NP [7]. This result applies even if an arc’s cost only depends on 
adjacent arcs. However, in this paper, we are able to obtain stronger inapprox- 
imability results. 

This paper is also concerned with the detection of negative cost cycles in
arc-dependent networks. In this problem, the goal is to find a simple cycle NC
such that the cost of NC is negative. Note that the cost of $e_1$, the first arc in
NC, depends on $e_k$, the last arc in NC. Thus, the negative cost of the cycle
depends only on the cycle itself and not on how the cycle was reached initially.
The negative cost cycle (NCC) problem in general is one of the more widely- 
studied problems in theoretical computer science and operations research due to 
its wide applicability in a number of domains. 

The study of arc-dependent networks is motivated by several applications 
including highway engineering. It is desirable to find optimal routes between 
points in a city or even a freeway network. These optimal routes are often mea- 
sured in terms of the time or distance to travel from one point to one or more 
other points. There already exist efficient algorithms for finding such optimal 
routes [2,5]. However, these algorithms assume that there are no restrictions or 
delays at intersections. If there were delays at intersections, it could alter the 
intended optimal routes [16]. This introduces the notion of “turn penalties” at 
each intersection [3]. These penalties can increase the time and/or distance of 
a route based on which turn is taken at intersections. Turn penalties can be 
modeled in arc-dependent networks by having the cost of an arc depend on the 
turn taken from an intersection. 

Arc-dependent networks also have applications in public transportation sys- 
tems. It is common for travelers to receive rebates when transferring from one 
service line to another [12,13]. These fare rebates introduce an additional layer 
of complexity in public transportation. Some routes may result in a discount, 
while other routes might not have a discount. If we use standard shortest path 
algorithms, the optimal route might not be optimal when fare discounts are 
included. 

The principal contributions of this paper are as follows: 


1. Establishing that the shortest negative cost cycle problem is NPO PB- 
complete (see Sect. 3). 

2. Establishing that the longest negative cost cycle problem is NPO PB- 
complete (see Sect. 3). 

3. A fixed-parameter tractable algorithm for the shortest path problem in terms
of the number of arcs in the path (see Sect. 4). 

4. Showing that the shortest path problem does not admit a kernel whose size 
is polynomial in the number of arcs in the path (see Sect. 5). 


2 Statement of Problems 


In this section, we describe the structure of arc-dependent networks. We also 
present the network optimization problems that are studied in this paper. 

Let $G = (V, E, C)$ denote a directed network. V is the vertex set with n
vertices, and $E = \{e_1, e_2, \ldots, e_m\}$ is the set of arcs. Let $s \in V$ be the source
vertex. The cost structure is represented by the matrix C, where entry $C[e_i, e_j]$
stores the cost of arc $e_j$ assuming that $e_j$ was entered through arc $e_i$. The matrix
C has $(m + 1)$ rows and m columns. The $(m + 1)^{th}$ row of C contains the cost
of arcs that do not have any incoming arcs. We use the phantom arc $e_0$ entering
s to account for these costs. We refer to G as an arc-dependent network.

Let $P_i$ denote a path $(e_1 - e_2 - e_3 - \cdots - e_k)$, where $e_1$ is an arc leaving some
vertex s, $e_k$ is an arc entering some vertex t, and i is a positive integer to help
label the unique path from s to t using arcs $e_1$ to $e_k$. It should be noted that there
can be more than one path from s to t. Note that $C[e_j, e_k]$ only matters when
the head of arc $e_j$ is the tail of arc $e_k$. Otherwise, $e_j$ cannot be the predecessor
of $e_k$. In cases where the head of $e_j$ is not the tail of $e_k$, we define $C[e_j, e_k] = 0$.
As a result, we note that C does not represent the connectivity of G. The cost
of a path $P_i$ between two vertices is given by:

$$cost(P_i) = C[e_0, e_1] + C[e_1, e_2] + \cdots + C[e_{k-1}, e_k].$$
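
As a minimal illustration of this cost definition, the sketch below stores the cost structure as a Python dictionary keyed by (predecessor arc, arc) pairs. The arc names, the `PHANTOM` marker standing in for the phantom arc entering s, and all numeric costs are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: cost of a path is the sum of C[previous arc, arc],
# starting from a phantom arc that enters the source.
PHANTOM = "e0"  # stands in for the phantom arc entering s


def path_cost(C, path):
    """Sum the arc-dependent costs along `path`, a list of (tail, head) arcs."""
    total = 0
    prev = PHANTOM
    for arc in path:
        # Consecutive arcs must be connected: head of prev is tail of arc.
        if prev != PHANTOM and prev[1] != arc[0]:
            raise ValueError(f"{prev} cannot precede {arc}")
        total += C[(prev, arc)]
        prev = arc
    return total


# Toy network (made-up costs): s -> a -> t and s -> b -> a -> t.
sa, sb, ba, at = ("s", "a"), ("s", "b"), ("b", "a"), ("a", "t")
C = {
    (PHANTOM, sa): 2,
    (PHANTOM, sb): 1,
    (sb, ba): 1,
    (sa, at): 5,  # entering (a, t) from (s, a) is expensive
    (ba, at): 1,  # entering (a, t) from (b, a) is cheap
}

print(path_cost(C, [sa, at]))      # 2 + 5 = 7
print(path_cost(C, [sb, ba, at]))  # 1 + 1 + 1 = 3
```

The same arc (a, t) contributes 5 or 1 depending on how it is entered, which is exactly what a fixed arc-cost function cannot express.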

In this paper, we explore the following optimization problem: 


Definition 1. Shortest Path (SP) problem: Given an arc-dependent network G, 
source vertex s, and target vertex t, what is the arc-dependent simple path from 
s to t in G with the least cost?
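
To make the SP problem concrete, here is a brute-force sketch that enumerates every simple path from s to t and keeps the cheapest one. It is only a baseline under assumed data structures (a dictionary of costs keyed by (predecessor arc, arc) and a `PHANTOM` marker for the source), it takes exponential time in general, and it is not the algorithm developed later in the paper.

```python
# Hypothetical baseline: exhaustive search over simple s-t paths.
PHANTOM = "e0"  # marker for "no predecessor" at the source


def shortest_simple_path(C, arcs, s, t):
    """Return (cost, path) of a least-cost simple s-t path, by enumeration."""
    out = {}
    for arc in arcs:
        out.setdefault(arc[0], []).append(arc)
    best = (float("inf"), None)

    def dfs(vertex, prev_arc, visited, cost, path):
        nonlocal best
        if vertex == t:
            if cost < best[0]:
                best = (cost, list(path))
            return
        for arc in out.get(vertex, []):
            head = arc[1]
            if head in visited:  # keep the path simple
                continue
            path.append(arc)
            visited.add(head)
            dfs(head, arc, visited, cost + C[(prev_arc, arc)], path)
            visited.remove(head)
            path.pop()

    dfs(s, PHANTOM, {s}, 0, [])
    return best


# Made-up example: the arc leaving s that looks cheaper leads to a worse path.
sa, sb, at, bt = ("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")
C = {(PHANTOM, sa): 1, (PHANTOM, sb): 2, (sa, at): 5, (sb, bt): 1}
print(shortest_simple_path(C, [sa, sb, at, bt], "s", "t"))
# (3, [('s', 'b'), ('b', 't')])
```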


We also explore the problem of finding simple negative cost cycles in arc- 
dependent networks. We call this problem the negative cost cycle problem, and 
it is defined as follows: 


Definition 2. Negative Cost Cycle (NCC) problem: Given an arc-dependent
network G, does G contain a simple cycle NC consisting of arcs $e_1$ through
$e_k$ such that:

$$C[e_k, e_1] + \sum_{i=2}^{k} C[e_{i-1}, e_i] < 0?$$


Note that the cost of $e_1$, the first arc in NC, depends on $e_k$, the last arc in
NC. Thus, the negative cost of the cycle depends only on the cycle itself and
not on how the cycle was reached initially.
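
The wrap-around term in the cycle cost can be sketched in a few lines. The dictionary layout and the triangle below are assumptions made for illustration; the point is that the first arc is charged relative to the last arc, so the cost does not depend on where the cycle is "started".

```python
# Hypothetical sketch of the cycle cost: every arc is charged relative to
# the arc before it, and the first arc is charged relative to the last.
def cycle_cost(C, cycle):
    """Cost of a cycle given as a list of (tail, head) arcs."""
    # cycle[i - 1] wraps to the last arc when i == 0.
    return sum(C[(cycle[i - 1], cycle[i])] for i in range(len(cycle)))


def is_negative_simple_cycle(C, cycle):
    """Check that `cycle` is closed, visits no vertex twice, and costs < 0."""
    closed = all(cycle[i - 1][1] == cycle[i][0] for i in range(len(cycle)))
    tails = [arc[0] for arc in cycle]
    simple = len(set(tails)) == len(tails)
    return closed and simple and cycle_cost(C, cycle) < 0


# Made-up triangle a -> b -> c -> a.
ab, bc, ca = ("a", "b"), ("b", "c"), ("c", "a")
C = {(ca, ab): 1, (ab, bc): -3, (bc, ca): 1}

print(cycle_cost(C, [ab, bc, ca]))   # 1 - 3 + 1 = -1
print(cycle_cost(C, [bc, ca, ab]))   # same cycle, rotated: still -1
print(is_negative_simple_cycle(C, [ab, bc, ca]))  # True
```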

In this paper, we study several variants of the NCC problem. These are the 
following: 

1. Shortest Negative Cost Cycle (SNCC) problem: Given an arc-dependent network
G, what is the simple arc-dependent negative cycle in G with the fewest
arcs?

2. Longest Negative Cost Cycle (LoNCC) problem: Given an arc-dependent net- 
work G, what is the simple arc-dependent negative cycle in G with the most 
arcs? 


3 Computational Complexity of SNCC and LoNCC 


In this section, we show that the shortest negative cost cycle (SNCC) and longest 
negative cost cycle (LoNCC) problems in arc-dependent networks are NPO 
PB-complete [10]. This is done by a reduction from the Minimum Ones and 
Maximum Ones problems respectively. 


Definition 3. Minimum Ones: Given a 3CNF formula $\Phi$, what is the minimum
number of variables assigned to true in any satisfying assignment to $\Phi$?


Definition 4. Maximum Ones: Given a 3CNF formula $\Phi$, what is the maximum
number of variables assigned to true in any satisfying assignment to $\Phi$?


Both the Minimum Ones and Maximum Ones problems are known to be
NPO PB-complete [9]. Let $\Phi$ be a 3CNF formula with n variables and m
clauses. From $\Phi$, we create an arc-dependent network G as follows:


1. Create the vertices $x_0$ and $z_0$.
2. For each variable $x_i$ in $\Phi$:

(a) Create the vertex $x_i$, the vertices $y^+_{i,1}$ through $y^+_{i,m+(n+2)\cdot(m+1)}$, and the
vertices $y^-_{i,1}$ through $y^-_{i,m}$.

(b) Create the arcs $(x_{i-1}, y^+_{i,1})$ and $(x_{i-1}, y^-_{i,1})$ with cost 0.

(c) Create the arc $(y^+_{i,1}, y^+_{i,2})$ with cost 0 if the preceding arc is $(x_{i-1}, y^+_{i,1})$
and cost 1 otherwise, and the arc $(y^-_{i,1}, y^-_{i,2})$ with cost 0 if the preceding
arc is $(x_{i-1}, y^-_{i,1})$ and cost 1 otherwise.

(d) For each $j = 3 \ldots m$, create the arc $(y^-_{i,j-1}, y^-_{i,j})$ with cost 0 if the preceding
arc is $(y^-_{i,j-2}, y^-_{i,j-1})$ and cost 1 otherwise.

(e) Create the arc $(y^-_{i,m}, x_i)$ with cost 0 if the preceding arc is $(y^-_{i,m-1}, y^-_{i,m})$
and cost 1 otherwise.

(f) For each $j = 3 \ldots (m + (n+2)\cdot(m+1))$, create the arc $(y^+_{i,j-1}, y^+_{i,j})$ with
cost 0 if the preceding arc is $(y^+_{i,j-2}, y^+_{i,j-1})$ and cost 1 otherwise.

(g) Create the arc $(y^+_{i,m+(n+2)\cdot(m+1)}, x_i)$ with cost 0 if the preceding arc is
$(y^+_{i,m+(n+2)\cdot(m+1)-1}, y^+_{i,m+(n+2)\cdot(m+1)})$ and cost 1 otherwise.

3. Create the arc $(x_n, z_0)$ with cost 0.

4. For each clause $\phi_j$ in $\Phi$:

(a) Create the vertex $z_j$.

(b) For each literal $x_i$ in $\phi_j$, create the arc $(z_{j-1}, y^-_{i,j})$ with cost 0 and the
arc $(y^-_{i,j}, z_j)$ with cost 0 if the preceding arc is $(z_{j-1}, y^-_{i,j})$ and cost 1
otherwise.

(c) For each literal $\neg x_i$ in $\phi_j$, create the arc $(z_{j-1}, y^+_{i,j})$ with cost 0 and the
arc $(y^+_{i,j}, z_j)$ with cost 0 if the preceding arc is $(z_{j-1}, y^+_{i,j})$ and cost 1
otherwise.

5. Create the arc $(z_m, x_0)$ with cost $-1$.


We construct G so that the only arc with negative cost is $(z_m, x_0)$. Thus, a
negative cost cycle must consist of the arc $(z_m, x_0)$ and a 0-cost path from $x_0$
to $z_m$. The arc costs in a negative cost cycle are defined in a manner such that
the only possible path uses the vertices $x_1$ through $x_n$ followed by the vertices
$z_0$ through $z_m$, with additional intermediate vertices.

Between each pair of vertices $x_{i-1}$ and $x_i$, there are two possible 0-cost paths.
The first path uses $(m + 1 + (n + 2) \cdot (m + 1))$ arcs and corresponds to setting
the variable $x_i$ to true. The second path uses $(m + 1)$ arcs and corresponds to
setting the variable $x_i$ to false. Thus, the path from $x_0$ to $x_n$ uses
$(n \cdot (m + 1) + k \cdot (n + 2) \cdot (m + 1))$ arcs, where k is the number of true variables.

Between each pair of vertices $z_{j-1}$ and $z_j$, there are up to three possible
0-cost paths each with two arcs. These paths are constructed such that only paths
corresponding to true literals can be used in a simple cycle. Thus, the path from
$z_0$ to $z_m$ uses $(2 \cdot m)$ arcs. With the additional arcs $(x_n, z_0)$ and $(z_m, x_0)$, the
cycle will use a total of $(k + 1) \cdot (n + 2) \cdot (m + 1)$ arcs.
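
The total can be checked by adding the three contributions: the variable part of the cycle (n short routes of m + 1 arcs each, with k of them extended by a further (n + 2)(m + 1) arcs), the clause part (two arcs per clause), and the two connecting arcs. As a worked sum:

```latex
\begin{align*}
\underbrace{n(m+1) + k(n+2)(m+1)}_{\text{variable part}}
  + \underbrace{2m}_{\text{clause part}}
  + \underbrace{2}_{\text{connecting arcs}}
  &= (n+2)(m+1) + k(n+2)(m+1) \\
  &= (k+1)(n+2)(m+1).
\end{align*}
```

The first equality uses $n(m+1) + 2m + 2 = n(m+1) + 2(m+1) = (n+2)(m+1)$.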

We construct G such that any negative cost cycle in G corresponds to choosing
a truth assignment to each variable in $\Phi$ and ensuring that each clause has
a true literal. We show that for any k, $\Phi$ has a satisfying assignment in which
k variables are set to true if and only if there is a simple negative cycle in G with
$(k + 1) \cdot (n + 2) \cdot (m + 1)$ arcs.
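
The construction can be sketched in code for small formulas. In the sketch below, all names are this example's own: each variable gets a "long" chain (the route taken when the variable is true) and a "short" chain (the route taken when it is false), clauses are lists of signed integers, and `required[arc]` records the one predecessor arc after which `arc` costs 0. It is an illustration under these assumptions, not the authors' implementation.

```python
def build_network(n, clauses):
    """Build the network for variables 1..n; literal i is x_i, -i its negation.

    Returns (required, closing). required[arc] is the predecessor arc that
    makes `arc` cost 0 (None: cost 0 after any predecessor); `closing` is
    the single arc of cost -1.
    """
    m = len(clauses)
    long_len = m + (n + 2) * (m + 1)   # inner vertices on the "true" chain
    required = {}

    def add_chain(i, kind, length):
        # x_{i-1} -> (kind, i, 1) -> ... -> (kind, i, length) -> x_i
        nodes = [("x", i - 1)] + [(kind, i, j) for j in range(1, length + 1)] + [("x", i)]
        prev = None                    # the first arc of a chain is free
        for arc in zip(nodes, nodes[1:]):
            required[arc] = prev
            prev = arc

    for i in range(1, n + 1):
        add_chain(i, "long", long_len)   # route meaning x_i = true
        add_chain(i, "short", m)         # route meaning x_i = false
    required[(("x", n), ("z", 0))] = None
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            # A positive literal detours through the short chain, which is
            # unused exactly when x_i is true; a negated one uses the long chain.
            mid = ("short" if lit > 0 else "long", abs(lit), j)
            enter, leave = (("z", j - 1), mid), (mid, ("z", j))
            required[enter] = None
            required[leave] = enter
    closing = (("z", m), ("x", 0))
    required[closing] = None
    return required, closing


def arc_cost(required, closing, prev, arc):
    if arc == closing:
        return -1
    return 0 if required[arc] in (None, prev) else 1


def cycle_from_assignment(n, clauses, assignment):
    """Arcs of the cycle that follows a satisfying assignment {i: bool}."""
    m = len(clauses)
    long_len = m + (n + 2) * (m + 1)
    nodes = [("x", 0)]
    for i in range(1, n + 1):
        kind, length = ("long", long_len) if assignment[i] else ("short", m)
        nodes += [(kind, i, j) for j in range(1, length + 1)] + [("x", i)]
    nodes.append(("z", 0))
    for j, clause in enumerate(clauses, start=1):
        lit = next(l for l in clause if assignment[abs(l)] == (l > 0))
        nodes += [("short" if lit > 0 else "long", abs(lit), j), ("z", j)]
    nodes.append(("x", 0))
    return list(zip(nodes, nodes[1:]))


# (x1 or x2 or not x3) and (not x1 or x2 or x3), with only x2 true (k = 1).
clauses = [[1, 2, -3], [-1, 2, 3]]
assignment = {1: False, 2: True, 3: False}
required, closing = build_network(3, clauses)
cycle = cycle_from_assignment(3, clauses, assignment)
cost = sum(arc_cost(required, closing, cycle[i - 1], cycle[i]) for i in range(len(cycle)))
print(cost, len(cycle))  # -1 and (1 + 1) * (3 + 2) * (2 + 1) = 30 arcs
```

The long chains are padded so that each additional true variable adds exactly (n + 2)(m + 1) arcs, which is what lets the cycle length encode the number of true variables.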


Lemma 1. Let $\Phi$ be a 3CNF formula with n variables and m clauses. Let G be
the corresponding network for $\Phi$. For any k, $\Phi$ has a satisfying assignment with
k variables set to true if and only if there is a simple negative cycle in G with
$(k + 1) \cdot (n + 2) \cdot (m + 1)$ arcs.


Proof. First assume that $\Phi$ has a satisfying assignment in which k variables are
set to true. Let x be such a satisfying truth assignment to $\Phi$. From x, we
construct a cycle NC in G as follows:

We examine each variable $x_i \in x$. If $x_i = true$, add the arcs $(x_{i-1}, y^+_{i,1})$,
$(y^+_{i,1}, y^+_{i,2})$, ..., $(y^+_{i,m+(n+2)\cdot(m+1)}, x_i)$ to NC. This adds $(m + 1 + (n + 2) \cdot (m + 1))$
arcs to NC. If $x_i = false$, add the arcs $(x_{i-1}, y^-_{i,1})$, $(y^-_{i,1}, y^-_{i,2})$, ..., $(y^-_{i,m-1}, y^-_{i,m})$,
$(y^-_{i,m}, x_i)$ to NC. This adds $(m + 1)$ arcs to NC. NC now contains a total of
$(n \cdot (m + 1) + k \cdot (n + 2) \cdot (m + 1))$ arcs. We then add the arc $(x_n, z_0)$ to NC.

We next examine each clause $\phi_j \in \Phi$. Since x is a satisfying assignment, at least
one literal in $\phi_j$ is set to true. If this true literal is $x_i$, then add the arcs $(z_{j-1}, y^-_{i,j})$
and $(y^-_{i,j}, z_j)$ to NC. Since $x_i$ is true, the vertex $y^-_{i,j}$ is not already on NC. If this
true literal is $\neg x_i$, then add the arcs $(z_{j-1}, y^+_{i,j})$ and $(y^+_{i,j}, z_j)$ to NC. Since $x_i$ is
false, the vertex $y^+_{i,j}$ is not already on NC. We then add the arc $(z_m, x_0)$ to NC.
This means NC now has a total of $(k + 1) \cdot (n + 2) \cdot (m + 1)$ arcs.

By construction of G and NC, the only arc in NC with non-zero cost is $(z_m, x_0)$
with cost $-1$. Thus, NC is a negative cycle with $(k + 1) \cdot (n + 2) \cdot (m + 1)$ arcs.

Now assume that G has a simple negative cycle NC. From NC, we construct
an assignment x to $\Phi$ as follows: For each variable $x_i$, if the arc $(x_{i-1}, y^+_{i,1})$ is in
NC, then set $x_i = true$. Otherwise, set $x_i = false$. By construction, the only
negative cost arc in G is $(z_m, x_0)$. Since $(z_m, x_0)$ has cost $-1$, NC cannot include
any arcs with cost 1.

Since NC contains an arc entering $x_0$, NC must contain an arc leaving $x_0$. For
each $i = 1, \ldots, n$, the only arcs leaving $x_{i-1}$ are $(x_{i-1}, y^+_{i,1})$ and $(x_{i-1}, y^-_{i,1})$. If
NC uses the arc $(x_{i-1}, y^+_{i,1})$, then the only way to avoid arcs of cost 1 is to use the
arcs $(x_{i-1}, y^+_{i,1})$, $(y^+_{i,1}, y^+_{i,2})$, ..., $(y^+_{i,m+(n+2)\cdot(m+1)}, x_i)$. Similarly, if NC uses the arc
$(x_{i-1}, y^-_{i,1})$, then the only way to avoid arcs with cost 1 is to use the arcs $(x_{i-1}, y^-_{i,1})$,
$(y^-_{i,1}, y^-_{i,2})$, ..., $(y^-_{i,m-1}, y^-_{i,m})$, $(y^-_{i,m}, x_i)$. Thus, for each $i = 1, \ldots, n$, the cycle NC
uses the vertex $x_i$. In particular, NC must contain an arc leaving $x_n$.

The only arc leaving $x_n$ is $(x_n, z_0)$. This arc must be on NC. Since NC
contains an arc entering $z_0$, NC must contain an arc leaving $z_0$. For each
$j = 1, \ldots, m$, the only arcs leaving $z_{j-1}$ are $(z_{j-1}, y^-_{i,j})$ for literals $x_i$ in $\phi_j$ and
$(z_{j-1}, y^+_{i,j})$ for literals $\neg x_i$ in $\phi_j$.

If NC uses the arc $(z_{j-1}, y^+_{i,j})$, then NC could not have used the arc
$(x_{i-1}, y^+_{i,1})$ previously. Thus, $x_i = false$ and the clause $\phi_j$ is satisfied.
Additionally, the only way to leave $y^+_{i,j}$ without using an arc with cost 1 is to use the
arc $(y^+_{i,j}, z_j)$.

If NC uses the arc $(z_{j-1}, y^-_{i,j})$, then NC must have used the arc $(x_{i-1}, y^+_{i,1})$
previously. Thus, $x_i = true$ and the clause $\phi_j$ is satisfied. Additionally, the only
way to leave $y^-_{i,j}$ without using an arc with cost 1 is to use the arc $(y^-_{i,j}, z_j)$.

Therefore, x is a satisfying assignment to $\Phi$. As previously shown, the number
of arcs in NC is $(k + 1) \cdot (n + 2) \cdot (m + 1)$, where k is the number of variables
set to true by x.


Using Lemma 1, we can show that both the SNCC and LoNCC problems for 
arc-dependent networks are NPO PB-complete. 


Theorem 1. The SNCC problem for arc-dependent networks is NPO PB- 
complete. 


Proof. First, we establish that a PTAS reduction [11] exists from the minimum
ones problem to the SNCC problem for arc-dependent networks. This will be
done by establishing the existence of the functions f, g, and $\alpha$.


1. The function f: Earlier in this section, we provided a method for constructing
an arc-dependent network G from a 3CNF formula $\Phi$ in polynomial time. This
forms the function f required for the PTAS reduction.

2. The function g: In the proof of Lemma 1, we provided a method to take a
simple negative cost cycle NC in G and construct a satisfying assignment to
$\Phi$. This forms the function g required for the PTAS reduction.

3. The function $\alpha$: Let $k^*$ be the minimum number of true variables in any satisfying
assignment to $\Phi$. From Lemma 1, G has a negative cost simple cycle with
$(k^* + 1) \cdot (n + 2) \cdot (m + 1)$ arcs. Additionally, if G has a simple negative cost
cycle with fewer arcs, then $\Phi$ would have a satisfying assignment with fewer
true variables. Thus, the SNCC of G has $(k^* + 1) \cdot (n + 2) \cdot (m + 1)$ arcs. Let
$\alpha(\varepsilon) = \frac{\varepsilon - 1}{2}$.

Let NC be a simple negative cost cycle in G with $(k + 1) \cdot (n + 2) \cdot (m + 1)$
arcs. The function g produces a satisfying assignment to $\Phi$ with k true
variables. Since we can determine if $\Phi$ is satisfied by the all false assignment
in polynomial time, we can assume without loss of generality that $k^* \ge 1$. If

$$\frac{(k + 1) \cdot (n + 2) \cdot (m + 1)}{(k^* + 1) \cdot (n + 2) \cdot (m + 1)} \le 1 + \alpha(\varepsilon) = \frac{\varepsilon + 1}{2},$$

then:

$$\frac{k}{k^*} = \frac{2 \cdot k}{2 \cdot k^*} \le \frac{2 \cdot (k + 1)}{k^* + 1} = \frac{2 \cdot (k + 1) \cdot (n + 2) \cdot (m + 1)}{(k^* + 1) \cdot (n + 2) \cdot (m + 1)} \le \frac{2 \cdot (\varepsilon + 1)}{2} = 1 + \varepsilon.$$


Thus, we have a PTAS reduction from the minimum ones problem to the 
SNCC problem for arc-dependent networks. 

Since the minimum ones problem is NPO PB-complete [9], the SNCC 
problem for arc-dependent networks is NPO PB-hard. As argued previously, 
the SNCC problem for arc-dependent networks is in NPO PB. Thus, the SNCC 
problem for arc-dependent networks is NPO PB-complete. 


Theorem 2. The LoNCC problem for arc-dependent networks is NPO PB- 
complete. 


The proof of Theorem 2 is similar to the proof of Theorem 1. Note that the 
SNCC problem is a restricted version of QSPP in which the cost of each arc 
depends only on the previous arc and the path must go from a vertex to itself. 
Thus, the inapproximability result for the SNCC problem also applies to QSPP. 
As a result, we have the following corollary: 


Corollary 1. QSPP is NPO PB-hard. 


4 Fixed-Parameter Algorithm for SP 


In this section, we present a Fixed-Parameter Tractable (FPT) algorithm for 
finding a shortest path p in an arc-dependent network G. This algorithm uses 
k, the number of arcs in p, as its parameter. 


4.1 Intuition 


Observe that if k is small, then it is easy to enumerate all possible paths and 
then return the shortest path found. Thus, the length of the path is a natural 


Path and Cycle Problems in Arc-Dependent Networks 299 


parameter for this problem. Let G be an arc-dependent network with initial 
vertex s and target vertex t. Our approach for solving this problem proceeds as 
follows: First, we design a randomized algorithm (Algorithm 4.1) that randomly 
partitions the vertices of G into $(k - 1)$ sets. Then we find the shortest path
from s to t that uses at most one intermediate vertex from each set.

Note that if the shortest path has k edges, then it has $(k - 1)$ intermediate
vertices. Thus, it is possible to construct a partition that assigns each
intermediate vertex to a different set. In Sect. 4.3, we prove that this happens with
probability at least $\frac{1}{e^{k-1}}$. We then derandomize the algorithm by proving that
only a limited number of partitionings need to be tested. This results in the
desired FPT algorithm.


4.2 Randomized Algorithm 


Let G be an arc-dependent network with initial vertex s and target vertex t.
The algorithm proceeds by first partitioning the vertices of G into the sets
$S_1, \ldots, S_{k-1}$. Then, the algorithm finds the shortest path p such that for each
set of vertices $S \in \{S_1, \ldots, S_{k-1}\}$, at most one vertex from S is on p. We refer
to such a path as a partitioned path. Note that every partitioned path is simple
and that every simple path of length k can be made into a partitioned path by
assigning every intermediate vertex in p to a different set. For a given vertex $v_i$
in G, let $S(v_i) \in \{S_1, \ldots, S_{k-1}\}$ be the set containing the vertex $v_i$.

Let us consider a method that constructs the partitioned path backwards,
starting from the destination t. Finding the shortest one-arc path to t can be done
by simply looking at the costs of the arcs going into t. We can then look further
back to consider all two-arc partitioned paths to t. As we continue backtracking,
we need to keep track of the sets containing the intermediate vertices so that we
do not use multiple vertices from the same set.

Suppose, for vertices $v_i$ and $v_j$ and $H \subseteq \{S_1, \ldots, S_{k-1}\} \setminus \{S(v_i), S(v_j)\}$, we
know the shortest partitioned path p from $v_i$ to t with predecessor $v_j$ such that
p uses at most one vertex from each set in H. Then, from the perspective of
further backtracking, it does not matter what order the vertices are visited by
p. We only need to know that the vertices in the sets in H cannot be used for
further backtracking. Thus, we only need to know the shortest partitioned path
for each starting vertex, predecessor vertex, and $H \subseteq \{S_1, \ldots, S_{k-1}\}$. This leads
us to a dynamic programming based algorithm.

Let $P(v_i, v_j, H)$ be the least cost of any path from $v_i$ to t with predecessor $v_j$
and set of used partitions H. Note that the cost of any path from $v_i$ to t depends
only on the vertex $v_j$ that precedes $v_i$. Thus, $P(v_i, v_j, H)$ is well defined. If there
are no intermediate vertices, then the only possible path from $v_i$ to t is the arc
$(v_i, t)$. Thus, $P(v_i, v_j, \emptyset) = C[(v_j, v_i), (v_i, t)]$. We will now show that

$$P(v_i, v_j, H) = \min_{v_r : S(v_r) \in H} \{C[(v_j, v_i), (v_i, v_r)] + P(v_r, v_i, H \setminus \{S(v_r)\})\}.$$

Theorem 3. Let $G = (V, E, C)$ be an arc-dependent network. For each $v_i, v_j \in V$
and each $H \subseteq \{S_1, \ldots, S_{k-1}\} \setminus \{S(v_i), S(v_j)\}$,

$$P(v_i, v_j, H) = \min_{v_r : S(v_r) \in H} \{C[(v_j, v_i), (v_i, v_r)] + P(v_r, v_i, H \setminus \{S(v_r)\})\}.$$

Proof. If $|H| = 1$, then there is only one set $S \in H$. By definition of P, the only
intermediate vertex between $v_i$ and t must belong to S. Thus, the partitioned
path from $v_i$ to t is of the form $(v_i, v_r) - (v_r, t)$, where $S(v_r) = S$. The cost of
this path is

$$P(v_i, v_j, H) = C[(v_i, v_r), (v_r, t)] + C[(v_j, v_i), (v_i, v_r)] = C[(v_j, v_i), (v_i, v_r)] + P(v_r, v_i, \emptyset).$$


Now assume that this holds true for all sets H of size h. Let $H'$ be a set of
size $(h + 1)$. Let p be the shortest partitioned path from $v_i$ to t with predecessor
$v_j$ and set of used partitions $H'$. We know that some $v_r$ such that $S(v_r) \in H'$
immediately follows $v_i$ on p. Thus, p can be broken up into the arc $(v_i, v_r)$
and a path $p'$ from $v_r$ to t. Note that $p'$ has $H' \setminus \{S(v_r)\}$ as its set of used
partitions and has $v_i$ as its predecessor. This means that the cost of $p'$ is at least
$P(v_r, v_i, H' \setminus \{S(v_r)\})$.

If there is a partitioned path $p^*$ from $v_r$ to t with set of used partitions
$H' \setminus \{S(v_r)\}$ and predecessor $v_i$ that is shorter than $p'$, then the path consisting
of the arc $(v_i, v_r)$ followed by $p^*$ is a partitioned path from $v_i$ to t with set of used
partitions $H'$ that is shorter than p. Since this violates the optimality of p, $p'$
must be the shortest path from $v_r$ to t with set of used partitions $H' \setminus \{S(v_r)\}$
and predecessor $v_i$. Therefore, the cost of $p'$ is $P(v_r, v_i, H' \setminus \{S(v_r)\})$. This means
that

$$P(v_i, v_j, H') = \min_{v_r : S(v_r) \in H'} \{C[(v_j, v_i), (v_i, v_r)] + P(v_r, v_i, H' \setminus \{S(v_r)\})\}.$$


From Theorem 3, $P(v_i, v_j, H)$ can be found in O(n) time once P is known
for every pair of vertices and every subset of H. Note that we only need to find
$P(v_i, v_j, H)$ when $(v_j, v_i) \in E$. Thus, there are $O(m \cdot 2^k)$ possible inputs to P.
This means that P can be computed in $O(m \cdot n \cdot 2^k)$ time using a dynamic
program. Once P is computed, it is easy to see that the shortest partitioned
path from s to t in G has cost $\min_{H \subseteq \{S_1, \ldots, S_{k-1}\}} P(s, \_, H)$, where _ implies that
s does not have a preceding arc in the shortest partitioned path from s to t.

We now provide a randomized algorithm for finding the shortest simple path
in an arc-dependent network G. This is represented by Algorithm 4.1. Algorithm
4.1 uses a similar color-coding technique to the one introduced in [1]. Recall that
P can be computed in $O(m \cdot n \cdot 2^k)$ time. Thus, Algorithm 4.1 runs in $O(m \cdot n \cdot 2^k)$
time.

SP_RAND (arc-dependent network G, integer k)

1: Create the sets $S_1$ through $S_{k-1}$.
2: for (each vertex $v_i$ in $V \setminus \{s, t\}$) do
3:     Randomly assign $v_i$ to a set $S(v_i) \in \{S_1, \ldots, S_{k-1}\}$.
4: Create the function $P(v_i, v_j, H)$ and define $P(v_i, v_j, \emptyset) = C[(v_j, v_i), (v_i, t)]$ for each
   pair of vertices $v_i$, $v_j$.
5: for (each pair of vertices $v_i$, $v_j$ and each $H \subseteq \{S_1, \ldots, S_{k-1}\} \setminus \{S(v_i), S(v_j)\}$) do
6:     $P(v_i, v_j, H) = \min_{v_r : S(v_r) \in H} \{C[(v_j, v_i), (v_i, v_r)] + P(v_r, v_i, H \setminus \{S(v_r)\})\}$.
7: return $\min_{H \subseteq \{S_1, \ldots, S_{k-1}\}} P(s, \_, H)$.


Algorithm 4.1: Randomized SP Algorithm for Arc-Dependent Networks
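
One way to realise a single run of this randomized procedure is sketched below. It is an illustration under stated assumptions rather than the authors' code: colour classes are encoded as bits of an integer mask, the recurrence is evaluated top-down with memoisation instead of in a table, costs live in a dictionary keyed by (predecessor arc, arc), and the names `sp_rand`, `shortest_path_colour_coding`, and `PHANTOM` are hypothetical.

```python
import math
import random
from functools import lru_cache

PHANTOM = "e0"  # stands for "s has no preceding arc"


def sp_rand(vertices, arcs, C, s, t, k, rng=random):
    """One trial: least cost of an s-t path with at most k arcs that uses
    at most one intermediate vertex from each of k - 1 random classes."""
    classes = k - 1
    colour = {v: rng.randrange(max(classes, 1)) for v in vertices if v not in (s, t)}
    out = {}
    for (u, v) in arcs:
        out.setdefault(u, []).append(v)

    @lru_cache(maxsize=None)
    def P(v, prev_arc, mask):
        # mask: classes that the remaining intermediate vertices must use.
        if mask == 0:
            return C[(prev_arc, (v, t))] if t in out.get(v, []) else math.inf
        best = math.inf
        for r in out.get(v, []):
            if r not in colour:          # s and t are never intermediate
                continue
            bit = 1 << colour[r]
            if mask & bit:
                best = min(best, C[(prev_arc, (v, r))] + P(r, (v, r), mask ^ bit))
        return best

    return min(P(s, PHANTOM, mask) for mask in range(1 << classes))


def shortest_path_colour_coding(vertices, arcs, C, s, t, k, trials, seed=0):
    """Repeat the trial; the minimum never undershoots the true optimum."""
    rng = random.Random(seed)
    return min(sp_rand(vertices, arcs, C, s, t, k, rng) for _ in range(trials))


# Made-up example: s-a-t costs 11, s-b-t costs 9, s-a-b-t costs 3.
V = ["s", "a", "b", "t"]
sa, sb, ab, at, bt = ("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")
ARCS = [sa, sb, ab, at, bt]
C = {
    (PHANTOM, sa): 1, (PHANTOM, sb): 4,
    (sa, ab): 1, (sa, at): 10,
    (ab, bt): 1, (sb, bt): 5,
}
print(sp_rand(V, ARCS, C, "s", "t", 2))  # 9: one class, so one intermediate vertex
print(shortest_path_colour_coding(V, ARCS, C, "s", "t", 3, trials=50))
```

With k = 3 a single trial finds the cost-3 path only when a and b land in different classes, so the second call relies on repetition; every trial returns the cost of some genuine simple path, so the error is one-sided.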


4.3 Proof of Correctness


We now show that if Algorithm 4.1 returns L, then G has a simple path from
s to t with total cost L using at most k arcs. We also show that if the shortest
simple path from s to t in G with at most k arcs has total cost L, then Algorithm
4.1 returns L with probability at least $\frac{1}{e^{k-1}}$.


Theorem 4. If Algorithm 4.1 returns L, then G has a simple path from s to t
with total cost L using at most k arcs.


Proof. Assume Algorithm 4.1 returns L. This means that there exist sets $S_1$
through $S_{k-1}$ such that there exists a partitioned path p from s to t in G with
total cost L. Note that p is a simple path. Additionally, p has at most $(k + 1)$
vertices (s, t, and at most one vertex from each of the $(k - 1)$ partitions). Thus,
p is a simple path from s to t of total cost L with at most k arcs.


Theorem 5. If a shortest simple path from s to t in G with at most k arcs has
total cost L, then Algorithm 4.1 will return L with probability at least $\frac{1}{e^{k-1}}$.


Proof. Let p be a shortest simple path from s to t in G with at most k arcs.
Note that p has total cost L. Observe that if the sets $S_1$ through $S_{k-1}$ are chosen
so that p is a partitioned path, then by Theorem 3, Algorithm 4.1 will return L.
We want to find the probability that p is a partitioned path. p is a partitioned
path if each intermediate vertex $v_i$ is assigned to a different set $S(v_i)$. Note that
there are $(k - 1)^{k-1}$ different ways to assign the $(k - 1)$ intermediate vertices of p
to the sets $S_1$ through $S_{k-1}$. Additionally, there are $(k - 1)!$ ways to assign each
vertex to a unique partition. Therefore, the probability of p being a partitioned
path is $\frac{(k-1)!}{(k-1)^{k-1}} \ge \frac{1}{e^{k-1}}$.
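
The success probability and the number of independent repetitions it implies can be computed directly. The sketch below is illustrative: the function names and the 1% failure target are assumptions, and the comparison value $e^{-(k-1)}$ is the bound that follows from the standard inequality $q! \ge (q/e)^q$ with $q = k - 1$.

```python
import math


def success_probability(k):
    """Chance that k - 1 intermediate vertices get pairwise distinct classes
    when each is assigned uniformly at random to one of k - 1 classes."""
    q = k - 1
    return math.factorial(q) / q ** q


def trials_for(k, failure=0.01):
    """Independent repetitions after which all trials miss with
    probability at most `failure`."""
    p = success_probability(k)
    if p >= 1.0:
        return 1
    return math.ceil(math.log(failure) / math.log(1.0 - p))


for k in range(2, 9):
    p = success_probability(k)
    # The exact probability is compared with the exponential lower bound.
    print(k, round(p, 5), round(math.exp(-(k - 1)), 5), trials_for(k))
```

The number of repetitions grows exponentially in k but does not depend on the size of the network, which is the behaviour expected of a fixed-parameter algorithm.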


4.4 Derandomization 


To obtain an FPT algorithm for finding a shortest path, we will derandomize 
Algorithm 4.1 as described in [4]. This derandomization utilizes (m, k)-perfect 
hash families which are defined as follows: 

Definition 5. Let S be a set of size m. An (m, k)-perfect hash family is a family
U of functions F that partition S into $S_1$ through $S_k$ such that for any set $R \subseteq S$
of size k, there exists a function that assigns each element of R to a different
partition.
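
For very small sets, the defining property of a perfect hash family can be checked by brute force. The sketch below is only a checker for the definition, with functions represented as hypothetical Python dictionaries; it does not construct the small families cited from the literature.

```python
from itertools import combinations


def is_perfect_hash_family(family, universe, k):
    """True if every k-subset of `universe` is sent to k different classes
    by at least one function in `family` (each function is a dict)."""
    return all(
        any(len({f[v] for v in subset}) == k for f in family)
        for subset in combinations(universe, k)
    )


# Two functions on {0, 1, 2, 3}: the low bit and the high bit of each element.
universe = [0, 1, 2, 3]
low_bit = {v: v & 1 for v in universe}
high_bit = {v: v >> 1 for v in universe}

print(is_perfect_hash_family([low_bit, high_bit], universe, 2))  # True
print(is_perfect_hash_family([low_bit], universe, 2))            # False: 0 and 2 collide
```

Any two distinct elements differ in at least one bit, so the two bit functions together separate every pair, while either one alone does not.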


Let p be a shortest path in G that uses at most k arcs. Note that p has $(k - 1)$
intermediate vertices. Let U be an $(n - 2, k - 1)$-perfect hash family of $V \setminus \{s, t\}$.
Then, for some $F \in U$, every intermediate vertex $v_i$ used by p is assigned to
a different set $S(v_i)$. Note that we can construct an $(n - 2, k - 1)$-perfect hash
family for $V \setminus \{s, t\}$ of size $e^{k-1} \cdot (k - 1)^{O(\log k)} \cdot \log(n - 2)$ in
$O(e^k \cdot k^{O(\log k)} \cdot n \cdot \log n)$ time [4].

Thus, given P, finding the shortest path from s to t in G with at most k arcs
can be done as follows:


1. Construct an $(n - 2, k - 1)$-perfect hash family U for $V \setminus \{s, t\}$. Note that U
contains $e^{k-1} \cdot (k - 1)^{O(\log k)} \cdot \log(n - 2)$ partitions of $V \setminus \{s, t\}$. This can be
done in $O(e^k \cdot k^{O(\log k)} \cdot n \cdot \log n)$ time.

2. For each $F \in U$, check if G has a partitioned path from s to t. Using the
method described previously, this takes $O(m \cdot n \cdot 2^k)$ time for each of the
$e^{k-1} \cdot (k - 1)^{O(\log k)} \cdot \log(n - 2)$ elements of U.


This algorithm runs in $O((2 \cdot e)^k \cdot k^{O(\log k)} \cdot m \cdot n \cdot \log n)$ time. Thus, this is an
FPT algorithm for finding a shortest simple path in an arc-dependent network.

Note that Algorithm 4.1 can be easily modified to find partitioned negative 
cycles. Applying the same derandomization technique will result in FPT algo- 
rithms for the SNCC and LoNCC problems when parameterized by the number 
of arcs in the cycle. 


5 Lower Bound on Kernel Size for the SP Problem 


In this section, we show that the SP problem for arc-dependent networks does 
not have a kernel whose size is polynomial in k, where k is the number of arcs 
in the path. This is done through the use of an OR-distillation [6]. 


Definition 6. Let P and Q be a pair of problems and let $t : \mathbb{N} \to \mathbb{N} \setminus \{0\}$ be a
polynomially bounded function. A t-bounded OR-distillation from P into Q
is an algorithm that for every s, given as input t(s) strings $x_1, \ldots, x_{t(s)}$ with
$|x_j| = s$ for all j:


1. Runs in polynomial time, and
2. Outputs a string y of length at most $t(s) \cdot \log s$ such that y is a yes instance
of Q if and only if $x_j$ is a yes instance of P for some $j \in \{1, \ldots, t(s)\}$.


If any NP-hard problem has a t-bounded OR-distillation, then $coNP \subseteq NP/poly$ [6].
If $coNP \subseteq NP/poly$, then $\Sigma_3^P = \Pi_3^P$ [15]. Thus, the polynomial
hierarchy would collapse to the third level.

Theorem 6. The SP problem for arc-dependent networks does not have a
polynomial sized kernel unless $coNP \subseteq NP/poly$.


Proof. We will prove this by showing that if the SP problem for arc-dependent
networks has a polynomial sized kernel, then there exists a t-bounded
OR-distillation from the SP problem for arc-dependent networks into itself.

For each j, let $G_j$ be an arc-dependent network with n vertices and m arcs
such that, for each pair of arcs $(v_i, v_j)$ and $(v_j, v_r)$, $|C[(v_i, v_j), (v_j, v_r)]| \le C_{max}$ for a
fixed integer $C_{max}$. Note that $s = |G_j| = m \cdot (m + \log C_{max})$.

Assume that for some constant c, the SP problem has a kernel of size $k^c$. Let
$t(s) = s^c$. Note that t(s) is a polynomial.

For each $j = 1 \ldots t(s)$, let $G_j$ be an arc-dependent network with n vertices
and m arcs such that $|G_j| = s$. From these networks, we can create a new
arc-dependent network G with $(t(s) \cdot (n - 2) + 2)$ vertices and $t(s) \cdot m$ arcs such that
G is a disjoint union of $G_1, \ldots, G_{t(s)}$ except the vertices s and t in G are used
to represent the vertices s and t respectively in each $G_j$.

Observe that no arc in G corresponding to an arc in $G_j$ shares vertices other
than s and t with an arc in G corresponding to an arc in $G_{j'}$, where $j' \ne j$. Thus,
any path from s to t in G corresponds to a path in $G_j$ for some $j \in \{1, \ldots, t(s)\}$.
Consequently, G has a path from s to t with total cost L if and only if $G_j$ has
a path from s to t with total cost L for some $j \in \{1, \ldots, t(s)\}$.

Let $G'$ be a kernel of G such that $|G'| \le k^c$. Since $k \le m \le s$, $|G'| \le k^c \le
s^c = t(s)$. Additionally, $G'$ has a path from s to t with total cost L if and only
if $G_j$ has a path from s to t with total cost L for some $j \in \{1, \ldots, t(s)\}$. Thus,
we have a t-bounded OR-distillation from the SP problem for arc-dependent
networks to itself. This cannot happen unless $coNP \subseteq NP/poly$. □
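
The composition step used in this argument, a union of networks that share only s and t, can be sketched as follows. The dictionary representation, the function name `compose`, and the relabelling scheme are assumptions made for illustration; the sketch also assumes that no input network contains a direct arc from s to t, since such arcs would coincide after merging.

```python
def compose(networks, s="s", t="t"):
    """Union of arc-dependent networks that share only s and t.

    Each network is a dict mapping (prev_arc, arc) -> cost, where arcs are
    (tail, head) pairs and the string "e0" marks the phantom predecessor.
    Private vertices of the j-th network are relabelled as (j, name).
    """
    def rename(j, v):
        return v if v in (s, t) else (j, v)

    def rename_arc(j, arc):
        if arc == "e0":
            return arc
        return (rename(j, arc[0]), rename(j, arc[1]))

    combined = {}
    for j, C in enumerate(networks):
        for (prev, arc), cost in C.items():
            combined[(rename_arc(j, prev), rename_arc(j, arc))] = cost
    return combined


# Two made-up instances, each a single path s -> a -> t on 3 vertices.
C1 = {("e0", ("s", "a")): 2, (("s", "a"), ("a", "t")): 3}
C2 = {("e0", ("s", "a")): 1, (("s", "a"), ("a", "t")): 2}
G = compose([C1, C2])

vertices = {v for (_, arc) in G for v in arc}
print(len(G), len(vertices))  # 4 cost entries, 4 vertices: s, t, (0, 'a'), (1, 'a')
```

Because the private vertices carry their instance index, a simple path from s to t cannot switch from one instance to another, which is the property the proof relies on.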


Note that the same t-bounded OR-distillation technique will work for the 
SNCC and LoNCC problems. This gives us the following result: 


Theorem 7. The SNCC and LoNCC problems for arc-dependent networks do
not have polynomial sized kernels unless $coNP \subseteq NP/poly$.


6 Conclusion 


In this paper, we presented inapproximability results for several negative cost 
cycle detection problems in arc-dependent networks. Specifically, we discussed 
detecting negative cost cycles in arc-dependent networks with the fewest arcs and 
the most arcs. We showed that the shortest and longest negative cost cycle prob- 
lems are NPO PB-complete. We also designed a Fixed-Parameter Tractable 
algorithm for finding a shortest path in arc-dependent networks. Finally, we 
showed that the shortest path problem in arc-dependent networks is unlikely to 
admit a kernel whose size is polynomial in the number of arcs in the path. The 
inapproximability results in this paper strengthen the inapproximability results 
established for QSPP in [7].

Abstract. Broadcasting is an information dissemination problem in a
connected network in which one node, called the originator, must
distribute a message to all other nodes of the network by placing a series of
calls along the communication lines of the network. The broadcast time
of a vertex is defined to be the minimum number of time units required
to broadcast the message to all vertices of the graph (network) from that
vertex. Finding the broadcast time of any vertex in an arbitrary graph
is NP-complete. Polynomial time solvability has been shown only for
certain tree-like graphs. In this paper we study the broadcast problem in
graphs of trees where a broadcast algorithm for the base graph is known.
For such graphs we design a linear time constant approximation algorithm
to determine the broadcast time of any originator in the general case. In
the particular case when the base graph is the hypercube or another
minimum broadcast graph (a graph with minimum possible broadcast time
having the smallest number of edges) containing one tree, we present a
linear time exact algorithm to find the broadcast time of any originator
vertex. When the base graph is the hypercube, we improve the
known result by presenting a 1.5-approximation algorithm to find the
broadcast time of the whole graph, which runs in linear time in contrast to
the known quadratic algorithm.


1 Introduction 


In today’s world, due to massive parallel processing, processors have become 
faster and more efficient. In recent years, a lot of work has been dedicated to 
studying properties of interconnection networks in order to find the best com- 
munication structures for parallel and distributed computing. One of the main 
problems of information dissemination investigated in this research area is broad- 
casting. The broadcast problem is one in which the knowledge of one processor 
must spread to all other processors in the network. For this problem we can view 
any interconnection network as a connected undirected graph G = (V, E), where 
V is the set of vertices (or processors) and E is the set of edges (or
communication lines) of the network. According to [13], the broadcast time problem was
introduced in 1977 by Slater, Cockayne and Hedetniemi. Extensive sources of
information about broadcasting and related problems are the survey articles [6, 13, 14],
the book [15] and the book chapter [9].

© Springer Nature Switzerland AG 2022

Formally, broadcasting is the message dissemination problem in a connected 
network in which one informed node, called the originator, must distribute a 
message to all other nodes by placing a series of calls along the communication 
lines of the network. The informed nodes aid the originator in distributing the 
message. This is assumed to take place in discrete time units. The broadcasting 
is to be completed as quickly as possible subject to the following constraints: 


— Each call requires one unit of time. 
— A vertex can participate in only one call per unit of time. 
— Each call involves only two adjacent vertices, a sender and a receiver. 
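
The three constraints above can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the function name `check_broadcast_scheme`, the adjacency-dictionary input and the list-of-rounds representation of a scheme are all assumptions made for illustration.

```python
def check_broadcast_scheme(adj, originator, rounds):
    """Return the number of rounds if `rounds` is a valid broadcast scheme
    that informs every vertex of `adj`; raise ValueError otherwise.

    adj    : dict mapping each vertex to an iterable of its neighbors
    rounds : list of rounds, each a list of (sender, receiver) calls
    """
    informed = {originator}
    for step, calls in enumerate(rounds, start=1):
        busy = set()             # vertices already taking part in a call this round
        newly_informed = set()
        for sender, receiver in calls:
            if sender not in informed:
                raise ValueError(f"round {step}: sender {sender} is not informed yet")
            if receiver not in adj[sender]:
                raise ValueError(f"round {step}: {sender} and {receiver} are not adjacent")
            if sender in busy or receiver in busy:
                raise ValueError(f"round {step}: a vertex takes part in two calls")
            busy.update((sender, receiver))
            newly_informed.add(receiver)
        informed |= newly_informed   # receivers can forward only from the next round on
    if informed != set(adj):
        raise ValueError("some vertices were never informed")
    return len(rounds)


# Path 0 - 1 - 2 - 3 with originator 1: two rounds suffice.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scheme = [[(1, 2)], [(1, 0), (2, 3)]]
print(check_broadcast_scheme(path, 1, scheme))
```

Each call occupies one round, the `busy` set enforces one call per vertex per round, and the adjacency test enforces that a call involves two adjacent vertices.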


Given a connected graph G and a message originator, vertex u, the natural 
question is to find the minimum number of time units required to complete 
broadcasting in graph G from vertex u. We define this number as the broadcast 
time of vertex u, denoted b(u, G) or b(u). The broadcast time b(G) of the graph 
G is defined as $\max\{b(u) \mid u \in V\}$. It is easy to see that for any vertex u in a
connected graph G with n vertices, $b(u) \ge \lceil \log n \rceil$ (all logarithms in the paper are
base 2), since during each time unit the number of informed vertices can at most 
double. Determining b(u) for an arbitrary originator u in an arbitrary graph G 
has been proved to be NP-complete in [22]. The problem remains NP-Complete 
even for 3-regular planar graphs [18] and for split graphs, the graphs whose 
vertex set can be partitioned into a clique and an independent set [16]. The best 
theoretical upper bound is obtained by the approximation algorithm in [4], which
produces a broadcast scheme with $O\left(\frac{\log |V|}{\log \log |V|}\, b(G)\right)$ rounds. Research in [21]
showed that the broadcast time cannot be approximated within a factor of $\frac{57}{56} - \epsilon$.
This result has been improved to a factor of $3 - \epsilon$ in [4]. As a result, research
has been directed toward finding approximation or heuristic algorithms
to determine the broadcast time in arbitrary graphs (see [1, 2, 4, 5, 7, 8, 17, 19, 20]).
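
To make the definition of b(u) and the doubling lower bound concrete, here is a small brute-force sketch in Python. It is an illustration only and not an algorithm from the paper: the names `broadcast_time` and `log_lower_bound` are hypothetical, the search is exponential in the number of vertices, and it assumes a connected graph given as an adjacency dictionary.

```python
from itertools import product


def log_lower_bound(n):
    """ceil(log2 n) for n >= 1, computed with integer arithmetic."""
    return (n - 1).bit_length()


def broadcast_time(adj, originator):
    """Exact b(originator, G) by breadth-first search over sets of informed
    vertices. Exponential time: only meant for very small connected graphs."""
    everyone = frozenset(adj)
    frontier = {frozenset([originator])}
    rounds = 0
    while everyone not in frontier:
        next_frontier = set()
        for informed in frontier:
            senders = list(informed)
            # Each informed vertex stays idle (None) or calls one uninformed neighbor.
            options = [[None] + [w for w in adj[v] if w not in informed]
                       for v in senders]
            for choice in product(*options):
                receivers = [w for w in choice if w is not None]
                if len(receivers) == len(set(receivers)):  # one call per receiver
                    next_frontier.add(informed | frozenset(receivers))
        frontier = next_frontier
        rounds += 1
    return rounds


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(broadcast_time(path, 0), broadcast_time(path, 1), log_lower_bound(4))
```

On the path with four vertices the end vertex needs three rounds while an inner vertex meets the bound of two rounds, which shows that the logarithmic bound is not always attained.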

Since the broadcast problem in general is very difficult, another direction is 
to design polynomial algorithms for some classes of graphs. The first result in 
this direction was a linear algorithm to determine the broadcast time of any 
tree [22]. Recent research shows that there are polynomial time algorithms for 
the broadcast problem in tree-like graphs where two cycles do not intersect - 
unicyclic graphs, tree of cycles, or in graphs containing no intersecting cliques - 
fully connected trees and tree of cliques [10-12]. No other results are known in 
this area. The broadcasting problem becomes very difficult when graphs contain 
intersecting cycles. The exception is of course the complete graph, where all 
edges are available to broadcast optimally. 

In this paper we consider broadcasting in a so-called graph of trees, where
every vertex of the base graph is the root of a tree. Polynomial time
broadcasting algorithms in such graphs become approachable when there is a polynomial
time broadcast algorithm for the base graph. In Sect. 2 we generalize the existing
result on hypercube of trees to any minimum broadcast graph with a dimensional
broadcast scheme, and present an exact algorithm which runs in linear time. In
Sect. 3 we improve an earlier result on hypercube of trees by reducing the
algorithm complexity from quadratic to linear, and also improve the approximation
ratio from 2 to 1.5. In Sect. 4 we design a linear time constant approximation
algorithm to determine the broadcast time of any originator for the graph of
trees when the broadcast algorithm for the base graph is known.


2 Exact Algorithm When the Base Graph is a k-Regular
mbg with One Tree


In this section we generalize the result presented in [3] for hypercube of trees with
one tree, and present a linear algorithm for any minimum broadcast graph (mbg)
on $2^k$ vertices with one tree. Our new algorithm and the proof of correctness are
different from the algorithm and correctness proof presented in [3].

Let $G_k$ be a minimum broadcast graph (mbg) on $2^k$ vertices. Recall that an
mbg is a graph on n vertices with broadcast time $\lceil \log_2 n \rceil$ containing the minimum
possible number of edges. There are three non-isomorphic mbgs known in the
literature: the hypercube $H_k$, the Knödel graph $W_{k,2^k}$ and the regular circulant graph
$C(4, 2^k)$ (see for example [9]). All three graphs share some graph-theoretic
and communication properties, some of which we will use in our algorithm below.
All three graphs are k-regular, with $2^k$ vertices, broadcast time
equal to k and diameter upper bounded by k. Another property that all
three graphs share is dimensionality, which will be used in our algorithm.


Property 1. [9]: In any minimum time broadcast scheme there are no idle vertices,
i.e. every vertex that receives the message at time i, $0 \le i \le k-1$, has to send the
message to its neighbors during time units $i+1, \dots, k$. Moreover, if one informed
vertex stays idle at time unit i, for any i, $2 \le i \le k$, then the other $2^k - 1$ vertices
can finish broadcasting in $k + 1$ time units.


Let the base graph of G be one of the three mbgs on $2^k$ vertices described above,
denoted by $H_k$, where r is the root of a tree T. The remaining $2^k - 1$ root vertices
do not contain any tree. One can assume that these $2^k - 1$ trees are empty
trees, containing only the root of the trees. Let us also assume that r has m
neighbors in T, vertices $v_1, v_2, \dots, v_m$, where $v_i$ is the root of the subtree $T_i$, $1 \le i \le m$.
Let $b(v_i, T_i) = t_i$ and, without loss of generality, assume that
$t_1 \ge t_2 \ge \dots \ge t_m$. Then it follows from [22] that $b(r, T) = \max\{i + t_i\}$, where
$1 \le i \le m$. Let $b(r, T) = \tau$ and $\tau \ge 1$ (see Fig. 1). Let us consider the largest
index j such that $\tau = t_j + j$ for $1 \le j \le m$.
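
The quantities $\tau$ and j defined above can be computed bottom-up from the recurrence $b(r, T) = \max\{i + t_i\}$. The sketch below is an illustration under assumptions, not the linear-time algorithm of [22]: it sorts the children's values, so it runs in O(n log n) time, and the names `tree_broadcast_times` and `tau_and_largest_j` as well as the children-dictionary representation are hypothetical.

```python
def tree_broadcast_times(children, root):
    """Return {v: broadcast time from v in the subtree rooted at v} for a
    rooted tree given as a dict mapping a vertex to the list of its children."""
    order, stack = [], [root]
    while stack:                       # iterative DFS: parents before children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    b = {}
    for v in reversed(order):          # children are handled before their parent
        times = sorted((b[c] for c in children.get(v, [])), reverse=True)
        b[v] = max((i + t for i, t in enumerate(times, start=1)), default=0)
    return b


def tau_and_largest_j(subtree_times):
    """Given t_1, ..., t_m in any order (m >= 1), sort them in non-increasing
    order and return (tau, j): tau = max{i + t_i}, j = largest index attaining it."""
    ts = sorted(subtree_times, reverse=True)
    tau = max(i + t for i, t in enumerate(ts, start=1))
    j = max(i for i, t in enumerate(ts, start=1) if i + t == tau)
    return tau, j


# Complete binary tree with 7 vertices rooted at "r".
tree = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
b = tree_broadcast_times(tree, "r")
print(b["r"], tau_and_largest_j([b["a"], b["b"]]))
```

Serving the subtree with the largest broadcast time first is what makes the sorted order optimal; the index j marks the last call of r that is still critical for the tree.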


2.1 Broadcast Algorithm When Originator is r 


Consider two cases depending on the relationship between $\tau$ and k, the dimension
of the mbg in G (the broadcast time of the mbg). Suppose that all the root vertices are
informed by time $\tau(r)$. The algorithm A calls another algorithm,
Broadcast-mbg, which returns $\tau(r)$. When a tree vertex is informed, it follows the
well-known broadcast algorithm in trees from [22].

The algorithm below is given for the Knödel graph $W_{k,2^k}$. A similar algorithm for
the regular circulant graph $C(4, 2^k)$ is easy to present, but we skip it because of
the space limitation. For the hypercube $H_k$ a simple algorithm is presented in [3]
(Fig. 2).


Fig. 1. G is one of three mbgs where r is the root of a tree T. In this case mbg is a 
k-dimensional hypercube 


Fig. 2. G with originator v. The subtree $T_i$ is separated from the rest of graph $G'$


Recall that the Knödel graph on $2^k$ vertices is a k-regular graph with vertex set
$V = \{0, 1, \dots, 2^k - 1\}$ and set of edges $E = \{(i, j) \mid i + j \equiv 2^p - 1 \pmod{2^k}$, for all
$p = 1, 2, \dots, k\}$.
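
The edge rule above translates directly into code. The following Python sketch is illustrative only; the function name `knodel_graph` and the adjacency-set representation are assumptions. It builds the graph and lets one confirm k-regularity on small instances.

```python
def knodel_graph(k):
    """Adjacency sets of the Knödel graph on 2**k vertices (k >= 1):
    i and j are adjacent iff i + j = 2**p - 1 (mod 2**k) for some p in 1..k."""
    n = 1 << k
    return {i: {((1 << p) - 1 - i) % n for p in range(1, k + 1)}
            for i in range(n)}


g = knodel_graph(3)
print(sorted(g[0]))                              # neighbors of vertex 0
print(all(len(nb) == 3 for nb in g.values()))    # every vertex has degree k = 3
```

The neighbor of i reached with a given p is often called its dimension-p neighbor; this is the notion of dimensionality that the broadcast scheme relies on.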


Approximation Algorithms in Graphs with Known Broadcast Time 309 


Broadcast Algorithm A:
INPUT: G = (V, E), originator r, $b(r, T) = \tau$, m, $t_1 \ge t_2 \ge \dots \ge t_m$
OUTPUT: Broadcast time $b_A(r)$ and broadcast scheme for G
BROADCAST-SCHEME-A(G, r, $\tau$, m, $t_1 \ge t_2 \ge \dots \ge t_m$)
0. $\tau = \max\{i + t_i\}$, where $1 \le i \le m$
1. If $\tau \le k$
  1.1. r informs another root vertex $r_1$ in the first time unit.
  1.2. $\tau(r)$ = BROADCAST-MBG(G, $r_1$, 1).
  1.3. For each time unit i = 2 to m + 1
    1.3.1. r informs tree vertex $v_{i-1}$.
2. If $\tau > k$
  2.1. If $\tau \ge k + m$
    2.1.1. For each time unit i = 1 to m
      2.1.1.1. r informs tree vertex $v_i$.
    2.1.2. For each time unit i = m + 1 to m + k
      2.1.2.1. an informed root vertex informs another uninformed root
               vertex using any shortest path.
  2.2. If $k + m - 1 \ge \tau \ge k + 1$
    Let j be the largest index such that $\tau = t_j + j$
    2.2.1. For each time unit i = 1 to j
      2.2.1.1. r informs tree vertex $v_i$.
    2.2.2. At time unit j + 1, r informs another root vertex $r_1$.
    2.2.3. $\tau(r)$ = BROADCAST-MBG(G, $r_1$, j + 1).
    2.2.4. For each time unit i = j + 2 to m + 1
      2.2.4.1. r informs tree vertex $v_{i-1}$.
    2.2.5. If $H_k$ is informed by time $\tau$
           then OUTPUT: $b_A(r)$
           else FOLLOW steps 1.1 to 1.3
3. TREE-BROADCAST($v_i$, $T_i$) for $1 \le i \le m$.


Broadcast-mbg: Knödel graph $W_{k,2^k}$
INPUT: G = (V, E), originator $r_1$, time at which $r_1$ is informed: $t_{r_1}$
OUTPUT: $\tau(r)$
BROADCAST-MBG(G, $r_1$, $t_{r_1}$)
1. Assume $r_1 = 2^k - 1$
2. For each time unit $t_{r_1} + l$, for $l = 1, 2, \dots, k - 1$
  2.1. For all informed vertices i among $0, 1, \dots, 2^k - 2$ do in parallel
    2.1.1. i sends to $2^l - 1 - i \bmod 2^k$
3. For all informed vertices i among $0, 1, \dots, 2^k - 2$ (that is, except $2^k - 1$) do in parallel
  3.1. i sends the message to $2^k - 1 - i \bmod 2^k$
4. Return $t_{r_1} + k$
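
The scheme can be simulated to check that it informs every root vertex within k time units after $r_1$ is informed. The Python sketch below is a hedged illustration: the function name `broadcast_mbg_knodel` is hypothetical, and it additionally assumes that the tree root r is vertex 0, the dimension-k neighbor of $r_1 = 2^k - 1$ (the pseudocode only fixes $r_1$). The root r is excluded from sending because it is busy informing its tree.

```python
def broadcast_mbg_knodel(k, t_r1=1):
    """Simulate dimensional broadcasting in the Knödel graph on 2**k vertices.

    Assumption: r = 0 has already called r1 = 2**k - 1 at time t_r1.
    Returns (finishing time, set of informed root vertices)."""
    n = 1 << k
    r, r1 = 0, n - 1
    informed = {r, r1}
    time = t_r1
    for l in range(1, k + 1):          # dimensions 1..k-1, then dimension k
        time += 1
        senders = informed - {r}       # r serves its tree instead
        # Every sender calls its dimension-l neighbor; the map is injective,
        # so no two senders call the same vertex in a round.
        receivers = {((1 << l) - 1 - i) % n for i in senders}
        informed |= receivers
    return time, informed


finish, informed = broadcast_mbg_knodel(3)
print(finish, sorted(informed))
```

In the last round $r_1$ would call r, which is already informed; the pseudocode skips that call, while the simulation simply leaves the informed set unchanged.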


Algorithm Complexity:
Broadcast-mbg takes $O(\log 2^k) = O(k)$ time to inform the root vertices.
Algorithm A: Step 0 takes m time to calculate $\tau$. Steps 1.1 and 1.3 take constant
time to run. Step 2.1.2 can be completed in O(k) time. Also, steps 2.1.1 and
2.2 run in constant time. Finally, the tree broadcast algorithm in step 3 takes
$O(|V| - 2^k) = O(|V_T|)$ time to run, where $|V_T|$ is the number of tree vertices in
G. Thus, the complexity of the algorithm is $O(|V_T| + k) = O(|V|)$.


Theorem 1. Algorithm A always generates the minimum broadcast time b(r). 


Proof. The proof is easy for $\tau \le k$ and for $\tau \ge m + k$. When $\tau \le k$, line 1.2
will require k more time units after time 1 to broadcast within the mbg, and following
line 1.3 it will take $\tau$ more time units after time 1 to complete broadcasting
within the tree T. Since $\tau \le k$, the algorithm will output $b_A(r) = k + 1$. From
Property 1 it follows that $b(r, G) \ge k + 1$.

If $\tau \ge m + k$, by line 2.1 vertex r makes no delay in broadcasting within tree
T; thus, all vertices of T will be informed by time $\tau$. By line 2.1.2 of algorithm
A all mbg vertices will be informed by time m + k. Since $\tau \ge m + k$, the
algorithm outputs $b_A(r) = \tau$. But $b(r) \ge \tau$ is an obvious lower bound on b(r).

It remains to consider the case $k + 1 \le \tau \le m + k - 1$. Following the
algorithm, the originator r informs the tree vertices at time units $1, \dots, j, j+2,
j+3, \dots, m+1$ and informs a root vertex at time unit j + 1. The algorithm completes
broadcasting in the mbg at time units $j+1, j+2, \dots, j+k+1$. Then, vertex
r completes broadcasting in tree T in $b_A(r, T) = \max\{t_1 + 1, \dots, t_j + j, t_{j+1} +
(j+2), \dots, t_m + (m+1)\}$. Since j was the maximum value for which $t_j + j = \tau$,
we have $t_{j+1} + (j+2) \le \tau$, $t_{j+2} + (j+3) \le \tau$, ..., $t_m + (m+1) \le \tau$. Therefore,
$b_A(r, T) = \max\{t_1 + 1, \dots, t_j + j, t_{j+1} + (j+2), \dots, t_m + (m+1)\} = \tau$. Thus,
$b_A(r, G) = \max\{j + k + 1, \tau\}$. If $\tau \ge j + k + 1$ then $b_A(r, G) = \tau$, which is
an obvious lower bound on b(r) = b(r, G). In the last case, if $\tau \le j + k$, the
algorithm will generate broadcast time $b_A(r, G) = \tau + 1$. To prove the lower
bound $b(r, G) \ge \tau + 1$ in this case, assume by contradiction that $b(r, G) = \tau$.
Then, originator r had to inform its j neighbors in tree T, then inform all its
neighbors in the mbg to be able to complete broadcasting in G within j + k time
units. It follows then that r does not have more neighbors in T, and so m = j.
Then $b(r, G) = \tau = j + k = m + k$. However, when $\tau = m + k$ the algorithm
follows line 2.1 considered above. Thus, $b(r, G) = \tau$ is impossible in this case.
Therefore, $b(r, G) \ge \tau + 1$, which is the output of algorithm A.


Broadcasting from a Root Vertex Other Than r: Let us assume that a
root vertex u is at a distance d from vertex r, where $k \ge d \ge 1$. The algorithm D
in G starts by informing along the path from u to r (the shortest among all paths between
u and r). Vertex r receives the message at time d, and then it sends the message to the
tree attached to it.


Broadcast Algorithm D:
INPUT: G = (V, E), originator u, $b(r, T) = \tau$
OUTPUT: Broadcast time $b_D(u)$ and broadcast scheme for G
BROADCAST-SCHEME-D(G, u, $\tau$)
1. u informs along the path ur (the shortest among all paths between u and
   r) in the first time unit.
2. u continues to inform the other root vertices using any shortest path.
   r receives the message at time d.
3. TREE-BROADCAST(r, T).


Complexity Analysis:

Steps 1 and 2 can be completed in O(k) time. The tree broadcast algorithm
in step 3 takes $O(|V_T|)$ time to run. The complexity of the algorithm is
$O(|V_T| + k) = O(|V|)$.


Broadcasting from a Tree Vertex: Broadcast algorithm Q is similar to the
algorithm above, and the broadcast tree of $G'$ originated at vertex r will be obtained
from algorithm A.


Broadcast Algorithm Q:
INPUT: $G'$, originator v in subtree $T_i$, r, $\tau_{m-1}$, m - 1, $t_1 \ge t_2 \ge \dots \ge t_{m-1}$
OUTPUT: Broadcast time $b_Q(v)$ and broadcast scheme for G
BROADCAST-SCHEME-Q($G'$, v, $T_i$, $\tau_{m-1}$, m - 1, r, $t_1 \ge t_2 \ge \dots \ge t_{m-1}$)
1. $T_{G'}$ = BROADCAST-SCHEME-A($G'$, r, $\tau_{m-1}$, m - 1, $t_1 \ge t_2 \ge \dots \ge t_{m-1}$)
2. Attach $T_{G'}$ with $T_i$ by the bridge $(r, v_i)$ and let the resulting tree be
   labelled as $T_v$.
3. TREE-BROADCAST(v, $T_v$).


Algorithm Complexity: Finding the broadcast time of a tree vertex in an 
arbitrary mbg of trees with one tree is equivalent to solving two problems: (1) 
Finding the broadcast time of a root vertex in an mbg of trees with one tree. 
As discussed before the complexity of this algorithm is linear. (2) Finding the 
broadcast time of a tree vertex in a tree which is also linear. 

The correctness proofs for the above two cases are similar to the proof of Theorem 1.


3 Linear Time 1.5-Approximation Algorithm for General 
Hypercube of Trees 


Recall that [4] presents an O(|V|) time 2-approximation algorithm for an arbitrary
originator vertex in any hypercube of trees G = (V, E) containing up to $2^k$ trees
rooted at the vertices of the k-dimensional hypercube. It is clear that to get a
2-approximation algorithm for the whole hypercube of trees G we have to run the
O(|V|) algorithm |V| times, once from every originator, and then take the maximum of
all broadcast times. The obtained algorithm will be a 2-approximation and will
run in $O(|V|^2)$ time. In this section we present an O(|V|) algorithm which
gives a 1.5-approximation for the broadcast time of any hypercube of trees G.
Our improvement in this section is twofold: we improve the complexity from
$O(|V|^2)$ to O(|V|) and we improve the approximation ratio from 2 to 1.5.

Assume graph G is an arbitrary hypercube of $2^k$ trees $T_i$ rooted at $r_i$, $i =
1, 2, \dots, 2^k$, where the base graph forms a hypercube of dimension k, $H_k$.
Denote by $h_i$ the height of tree $T_i$ rooted at $r_i$ for all $i = 1, 2, \dots, 2^k$. Denote the
maximum value of the heights of all $2^k$ trees by h, and the maximum broadcast time of
the trees from their roots at the hypercube vertices by t. More formally, $h = \max\{h_i \mid 1 \le
i \le 2^k\}$ and $t = \max\{b(r_i, T_i) \mid 1 \le i \le 2^k\}$. Our algorithm is very simple: for
all $i = 1, 2, \dots, 2^k$ we find both the height $h_i$ and the broadcast time $b(r_i, T_i)$ in
O(|V|) time using the well-known O(|V|) time algorithm for trees. Then, in the
same O(|V|) time, we find the values of k, h and t defined above.
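
As an illustration of the quantities h, t and k, the following Python sketch computes them for a hypercube of trees and returns the value h + k + t together with the two lower bounds on b(G) that the analysis of this section relies on, $k + \frac{1}{2}(h + t)$ and $h + t$. It is a sketch under assumptions, not the authors' implementation: the function names and the (children, root) input format are hypothetical, and sorting the children's broadcast times makes it O(n log n) rather than linear.

```python
from math import ceil


def height_and_broadcast_time(children, root):
    """Height and root broadcast time of a rooted tree (dict: vertex -> children)."""
    order, stack = [], [root]
    while stack:                       # parents are appended before their children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    height, b = {}, {}
    for v in reversed(order):          # children are handled before their parent
        kids = children.get(v, [])
        height[v] = 1 + max((height[c] for c in kids), default=-1)
        times = sorted((b[c] for c in kids), reverse=True)
        b[v] = max((i + t for i, t in enumerate(times, start=1)), default=0)
    return height[root], b[root]


def hypercube_of_trees_bounds(k, trees):
    """trees: one (children, root) pair per hypercube vertex, 2**k in total.
    Returns (lower bound on b(G), upper bound h + k + t)."""
    assert len(trees) == 1 << k
    stats = [height_and_broadcast_time(ch, root) for ch, root in trees]
    h = max(hi for hi, _ in stats)
    t = max(bi for _, bi in stats)
    upper = h + k + t
    lower = max(k + ceil((h + t) / 2), h + t)
    return lower, upper


# 2-dimensional hypercube; only root 0 carries a tree, the path 0 - a - b.
trees = [({0: ["a"], "a": ["b"]}, 0), ({}, 1), ({}, 2), ({}, 3)]
print(hypercube_of_trees_bounds(2, trees))
```

In this small instance h = t = k = 2, so the upper bound is 6 and the lower bound is 4, which matches the ratio 1.5 exactly.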


Theorem 2. The broadcast time of graph G satisfies $b(G) \le h + k + t$; the above
algorithm has approximation ratio 1.5 and runs in O(|V|) time.


Proof. The fact that our algorithm runs in O(|V|) time is easy to see. Now, let us first
prove that the broadcast time of any originator u in graph G is upper bounded
by h + k + t, i.e. $b(u, G) \le h + k + t$. Assume vertex u is in tree $T_i$; then broadcasting
from u will inform $r_i$, the root of $T_i$, by time unit $h_i$. Then within the next
k time units $r_i$ will inform all $2^k$ vertices of the hypercube $H_k$. Thus, after at
most $h_i + k$ time units all hypercube vertices will be informed. Then, in the next
$t = \max\{b(r_i, T_i) \mid 1 \le i \le 2^k\}$ time units the hypercube vertices will inform all
vertices in their respective trees. Thus, $b(u, G) \le h_i + k + t \le h + k + t$. Note
that $0 \le h_i \le h$ and $0 \le b(r_i, T_i) \le t$, and the bound is correct for the cases
$h_i = 0$ or $b(r_i, T_i) = 0$. This covers the case when the originator is $r_i$ or a tree
$T_i$ is empty with $b(r_i, T_i) = 0$.

To get a lower bound on b(u, G), note that if originator u is a deepest vertex of a tree
$T_i$ with height h then any broadcast scheme from originator u will take at least
h + k time units, $b(u, G) \ge h + k$. Also, if the originator w is a root vertex $r_j$
which has distance k from the root of tree $T_s$ with $b(r_s, T_s) = t$, then clearly
$b(w, G) \ge k + t$. Combining these two inequalities and applying $b(G) \ge b(u, G)$
and $b(G) \ge b(w, G)$ we get $2b(G) \ge h + k + k + t$. Thus, $b(G) \ge k + \frac{1}{2}(h + t)$.

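Let $b_A(G) = h + k + t$ denote the value returned by the algorithm. If $h + t \le 2k$,
then $2k = h + t + 2x$ for some $x \ge 0$, and we get that

$$\frac{b_A(G)}{b(G)} \le \frac{h + k + t}{k + \frac{1}{2}(h + t)} = \frac{k + 2k - 2x}{k + \frac{1}{2}(2k - 2x)} = \frac{3k - 2x}{k + k - x} \le \frac{3k - 1.5x}{2k - x} = \frac{3}{2}.$$

If $h + t > 2k$, then consider broadcasting from originator u which is in a tree
$T_s$ and $dist(u, r_s) = h$. Then, $b(u, G) \ge h + t$ since broadcasting from vertex u
has to go through vertex $r_s$ and all vertices of a tree $T_i$ with $b(r_i, T_i) = t$ must
get informed by time b(u, G). Thus, since $b(u, G) \le b(G)$ for any originator u we
will get

$$\frac{b_A(G)}{b(G)} \le \frac{h + k + t}{h + t} < \frac{h + t + \frac{h + t}{2}}{h + t} = \frac{3}{2}.$$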


4 Linear Time Constant Approximation Algorithm 
in Graph of Trees with Known Broadcast Time 
of the Base Graph 


Assume that we have an arbitrary graph H where its vertices are the roots of
some trees. We call the resulting graph an arbitrary graph of trees G (see Fig. 3).

Definition 1. Consider an arbitrary graph $H = (V_H, E_H)$ where k of its
vertices, denoted as $V_r$, are the roots of the trees $T_i = (V_i, E_i)$ for $1 \le i \le k$ and
$k \le |V_H|$. We define the arbitrary graph of trees, G = (V, E), to be a graph where
$V = V_1 \cup V_2 \cup \dots \cup V_k \cup (V_H - V_r)$ and $E = E_1 \cup E_2 \cup \dots \cup E_k \cup E_H$. The vertices
in H denoted as $r_i$ will be called root vertices, where $1 \le i \le |V_H|$. The rest of
the vertices will be called tree vertices.


Fig. 3. Arbitrary graph of trees G 


The first approximation algorithm is simple. When the originator is a root 
vertex r, then our algorithm S in G starts by informing all the vertices of H. 
When all the vertices in H are informed, each root vertex informs the tree 
attached to it. 

When the originator vertex w belongs to some tree, then the algorithm S
in G starts by informing along the path from w to the root r of that tree. When r
receives the message, the scheme informs all the vertices of H. When a tree vertex is
informed, it follows the well-known broadcast algorithm in trees [22], called $A_T$.


Tree Broadcast Algorithm $A_T$:
INPUT: originator $r_i$ and tree rooted at $r_i$: $T_i$
OUTPUT: Broadcast time $b_{A_T}(r_i, T_i)$
TREE-BROADCAST($r_i$, $T_i$)
1. $r_i$ informs a child vertex in $T_i$ that has the maximum broadcast time in the
   subtree rooted at it.
2. Let $\alpha_1, \dots, \alpha_f$ be the broadcast times of the f subtrees rooted at the
   children of $r_i$ and $\alpha_1 \ge \dots \ge \alpha_f$. Then, $b_{A_T}(r_i, T_i) = \max\{j + \alpha_j\}$ for $1 \le j \le f$.

Approximation Algorithm S:
INPUT: G = (V, E) and any originator x
OUTPUT: Broadcast time $b_S(x)$ and broadcast scheme for G
BROADCAST-SCHEME-S(G, x)
1. If x = w
  1.1. w broadcasts along the shortest path wr in time unit 1.
       r gets informed at time d.
  1.2. Starting at time d + 1 onwards inform all the vertices in H.
2. If x = r
  2.1. Inform all the vertices in H.
3. TREE-BROADCAST($r_i$, $T_i$) for $1 \le i \le m$. (m is the degree of r in tree T)


Algorithm Complexity:

Steps 1.2 and 2.1 take O(m) time to inform the vertices in H. In steps 1.1
and 3, the tree broadcast algorithm takes $O(|V| - m) = O(|V_T|)$ time to run.
The complexity of the algorithm is $O(|V_T| + m) = O(|V|)$.


Proposition 1. If there is an exact algorithm for broadcast time from any
originator of graph H, then algorithm S is a 2-approximation in graph G.


Proof. We skip the proof of the proposition because it is very similar to the 
proof of Theorem 3 presented next. 


Theorem 3. If there is a c-approximation algorithm for the broadcast time
problem in H from any originator for some constant $c \ge 1$, then algorithm S is a
2c-approximation for any originator in graph G.


Proof. When the broadcast originator u is in base graph H, $u = r_0$:
Following our algorithm S, originator u will follow the c-approximation algorithm,
call it C, within the base graph H. Then by time unit $b_C(u, H)$ all vertices of
base graph H will be informed, and they will complete broadcasting within their
respective trees by time unit $b_C(u, H) + \max\{b(r_i, T_i) \mid i = 1, 2, \dots, |V_H|\}$. Let us
assume that $\max\{b(r_i, T_i)\} = t$; then we have $b_S(u, G) \le b_C(u, H) + t$. On the
other hand, it is obvious that $b(u, G) \ge b(u, H)$ and also $b(u, G) \ge t$, since all
the vertices of graph H and the vertices of the tree, say $T_i$ with $b(r_i, T_i) = t$, must
be informed by time unit b(u, G). Thus, $b(u, G) \ge \frac{b(u, H) + t}{2}$ and we get

$$\frac{b_S(u, G)}{b(u, G)} \le \frac{b_C(u, H) + t}{\frac{b(u, H) + t}{2}} = 2\,\frac{b_C(u, H) + t}{b(u, H) + t} \le 2\,\frac{c\,b(u, H) + c\,t}{b(u, H) + t} = 2c,$$

since $c \ge 1$ and also algorithm C was a c-approximation algorithm in the base graph H.


When the originator u is a vertex in tree $T_i$ rooted at vertex $r_i \in V_H$:
As above, following algorithm S the originator u from the tree $T_i$ first will
inform its root vertex $r_i$ by a direct path of length, say, $h_i$, then will follow the
c-approximation algorithm within the base graph H, and then starting at time
unit $h_i + b_C(r_i, H)$ all vertices of the base graph H will start broadcasting within
their respective trees. Thus, $b_S(u, G) \le h_i + b_C(r_i, H) + \max\{b(r_j, T_j) \mid j =
1, 2, \dots, |V_H|\}$. Similarly, for the lower bounds, it is clear that $b(u, G) \ge
h_i + b(r_i, H)$ and $b(u, G) \ge h_i + \max\{b(r_j, T_j) \mid j = 1, 2, \dots, |V_H|\}$. Suppose
that $\max\{b(r_j, T_j) \mid j = 1, 2, \dots, |V_H|\} = t$.
Thus, $b(u, G) \ge h_i + \frac{b(r_i, H) + t}{2}$ and finally we get that

$$\frac{b_S(u, G)}{b(u, G)} \le \frac{h_i + b_C(r_i, H) + t}{h_i + \frac{b(r_i, H) + t}{2}} \le 2\,\frac{c\,h_i + c\,b(r_i, H) + c\,t}{2h_i + b(r_i, H) + t} \le 2c,$$

since algorithm C is a c-approximation algorithm for base graph H, and $c \ge 1$.

Observation: Theorem 3 and Proposition 1 can be applied to many graphs
known in the literature. In particular, for the m-dimensional butterfly network $BF_m$
[14], the m-dimensional shuffle-exchange graph $SE_m$ [14] and the m-dimensional
DeBruijn graph $DB_m$, constant approximation algorithms are known. Proposition
1 can be applied to hypercube [3], cube-connected cycle [14] and fully connected
tree [12].


5 Conclusion and Future Work 


The broadcast problem, more precisely, finding the broadcast time of any ver- 
tex in an arbitrary connected graph is very difficult. It remains NP-complete 
even for 3-regular planar graphs and for split graphs, graphs whose vertex set 
can be partitioned into a clique and an independent set. The broadcast problem 
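is shown to be NP-hard to approximate within a factor $3 - \epsilon$. The best known
approximation for broadcasting in general graphs is $O\left(\frac{\log |V|}{\log \log |V|}\right)$. A
long-standing open problem is to present a constant approximation algorithm or to prove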
that it is NP-hard to approximate within a constant factor. Polynomial time 
algorithms for the broadcast problem are only known for some tree like graphs. 
In particular, there exist linear algorithms for trees, tree of cycles and necklace 
graphs, more generally in graphs where two cycles intersect in at most one ver- 
tex. Tree of cliques is the only graph where two cycles intersect in many vertices 
but there is a O(n log log n) algorithm. However, it is a special case since in the 
clique all edges are available to be used for optimal broadcasting. 

In this paper we consider graph of trees for which the exact or approxima- 
tion broadcast algorithm for the base graph is known. First, we generalize the 
existing result on hypercube of trees for any minimum broadcast graph, and 
present an exact algorithm which runs in linear time. Next, we improve an ear- 
lier result on hypercube of trees by reducing both the algorithm complexity and 
the approximation ratio. The last result is a linear time 2-approximation algo- 
rithm to determine the broadcast time of any originator for the graph of trees 
when the broadcast algorithm for the base graph is known. 

The immediate future work will be to generalize the result for mbgs on $2^k$
vertices to any broadcast graph (bg) on any number of vertices. Here the main
difficulty is to prove a property similar to Property 1 from Sect. 2. Recall that
a bg on n vertices is a graph with broadcast time $\lceil \log n \rceil$ from any originator.

